AI Agent Insights

Notes from the field on AI Agent reliability

Ryan Brandt - Author
Ryan Brandt
January 29, 2026·7 min read

The Mechanics of AI-First Engineering

Everyone's talking about 90% AI-written code. Here's what it actually looks like day to day.

AI · Claude Code · Engineering · Productivity · Automation
Ryan Brandt
January 28, 2026·10 min read

An Enlarged Intimate Supplement to His Memory

How I built a local context lake that pulls from every conversation source automatically, and why Vannevar Bush's 1945 vision of the Memex finally makes sense.

CRM · AI Agents · Automation · Productivity
Ryan Brandt
December 8, 2025·16 min read

AgentMail: Email Infrastructure for the Agentic Era

A deep dive into AgentMail's thesis that agents will become first-class internet users, with email as their primary communication protocol. Based on technical discussions with co-founder Adi Singh.

AI Agents · Email Infrastructure · Agent-to-Agent · YC · Deep Dive
Ryan Brandt
November 12, 2025·8 min read

Cursor: The Everything App

I wanted to interview at OpenAI but didn't know anyone there. So I built an agent in Cursor to optimize my cold outreach. That worked. Then I kept building. Now Cursor runs my entire life.

AI · Cursor · Automation · Productivity
Ryan Brandt
October 28, 2025·22 min read

Testing LangSmith's Insights Agent: 87.92% Coverage in 35 Minutes

We spent 20 hours with domain experts manually annotating 207 production agent traces to understand failure patterns. Then we tested whether LangSmith's Insights Agent could automate this process. It found 87.92% of our failure patterns in 35 minutes.

AI · Evals · LangSmith · Testing · Agent Engineering
Ryan Brandt
October 13, 2025·14 min read

The Unknown Unknowns Problem in AI Evaluation

Why automated tests miss the failures that matter most, and how manual error analysis discovers the bugs you never imagined existed.

AI · Evals · Testing · Error Analysis · Engineering
Ryan Brandt
October 10, 2025·7 min read

The $500 AI That Just Beat Gemini at Abstract Reasoning

Samsung's 7-million-parameter model outperforms giants on ARC-AGI 2. As the lead contributor to that benchmark, here's why this matters and what it means for the future of AI.

AI · Machine Learning · Reasoning · Efficiency · Research
Ryan Brandt
October 8, 2025·13 min read

How to Actually Evaluate Your LLM (And Stop Guessing)

A methodological walkthrough using a hypothetical customer service bot to show how to move from vibes-based evaluation to systematic, measurable improvements.

AI · Evals · LLM · Product Design · Engineering
Ryan Brandt
July 29, 2025·5 min read

Prompting 101: How to Make a Good Prompt

A practical guide to writing clear, effective prompts that get consistent results from LLMs.

Prompts · AI Development · LLM · Tutorial
Ryan Brandt
July 25, 2025·8 min read

The Most Valuable Part of Evals Cannot Be Automated

A simple, non-technical guide to fixing AI agents by analyzing what went wrong, measuring the impact, and improving systematically.

Evals · AI Development · Agentic Workflows · Debugging Agents
Ryan Brandt
July 22, 2025·7 min read

Application-Centric Evals: Stop Playing Whack-a-Mole

How to ship something people trust, come back to, and pay for. Inspired by Hamel Husain and Shreya Shankar's course.

Evals · AI Development · LLM · Product
Ryan Brandt
July 3, 2025·9 min read

How MCP actually works and why FastMCP is the easiest way to use it

Breaking down how the Model Context Protocol works, why it's structured the way it is, and why FastMCP is the best way to implement it in practice.

MCP · AI Development · Protocol · FastMCP · AI Agents · LangChain
Ryan Brandt
January 20, 2025·18 min read

Building High-Quality LLM Judges: A Data-Driven Approach with Claude Code

How we achieved 82% recall with only a 2% generalization gap through 10 iterations of systematic prompt engineering in a single afternoon.

AI · Evals · LLM · Prompt Engineering · Claude Code