About

Hi, I'm Qinglin Dong. I work on AI agents and evaluation systems.

This blog is where I share my thinking on building reliable AI agents, designing meaningful evaluations, and tackling the practical challenges of making these systems work.

I'm particularly interested in:

Agent architectures and tool use
Evaluation methodology for LLM-based systems
Bridging the gap between benchmarks and real-world performance
The craft of prompt engineering and system design

Find me on GitHub and Twitter/X.