I'm an engineer focused on making AI systems more useful and reliable in practice.
Currently I build developer tools and agent infrastructure. Most of my work sits at the intersection of large language models and real-world software engineering - figuring out how to make agents that actually work, fail gracefully, and compose well with existing systems.
I think a lot about the gap between "demo that works" and "system you'd trust in production." Concretely, that means:
- Foxl - a personal AI agent that sits between Claude Code and Claude Cowork. It has a UI, but it can also touch your filesystem, automate browsers, run terminals, and manage scheduled tasks. I built it because I wanted Claude to be more than a chat window but didn't want to live in the terminal either. Skills and the agent harness are defined as markdown files (SKILL.md), so teaching the agent a new capability is just writing a file. I use it for ~80-90% of my daily work.
- Agent architectures - multi-agent orchestration, tool use patterns, skills and harness design, and the surprisingly hard problem of getting agents to know when they don't know something. A lot of this thinking comes from building Foxl and shipping agent solutions for customers.
- Developer experience - I maintain Amazon Bedrock Client for Mac, a native macOS app for working with foundation models. Good tools change how people think about what's possible.
- Applied AI patterns - RAG pipelines, document processing, intelligent routing. The kind of work where the architecture matters more than the model.
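To make the skills-as-markdown idea concrete, here's a sketch of what a SKILL.md might look like - the field names and structure are illustrative, not Foxl's exact schema:

```markdown
---
name: changelog-writer
description: Drafts a changelog entry from recent git commits
---

# Changelog Writer

When the user asks for a changelog entry:

1. Run `git log --oneline <last-tag>..HEAD` to collect recent commits.
2. Group the commits into Added / Changed / Fixed sections.
3. Draft the entry in Keep a Changelog style and show it for review
   before writing anything to disk.
```

Dropping a file like this into the skills directory is the whole integration story: no plugin API, no recompile, just prose the agent can read.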
I've also built workshops on agent frameworks, agent infrastructure, and MCP-based tooling - mostly because I learn best by teaching.
Before this, I spent ~3 years as a Solutions Architect at AWS helping teams ship AI workloads, and 5 years as a software engineer at Samsung. The through-line is building systems that hold up under real usage.
I care about clean abstractions, honest benchmarks, and software that respects the people using it.
If you're building with agents and hitting the messy parts - agent reliability, evaluation, production deployment - I'm always happy to compare notes.



