Insights

AI Agent Insights

Notes from the field on AI Agent reliability

Ryan Brandt

January 29, 2026·7 min read

The Mechanics of AI-First Engineering

Everyone's talking about 90% AI-written code. Here's what it actually looks like day to day.

AI, Claude Code, Engineering, Productivity, Automation

Ryan Brandt

January 28, 2026·10 min read

An Enlarged Intimate Supplement to His Memory

How I built a local context lake that pulls from every conversation source automatically, and why Vannevar Bush's 1945 vision of the Memex finally makes sense.

CRM, AI Agents, Automation, Productivity

Ryan Brandt

December 8, 2025·16 min read

AgentMail: Email Infrastructure for the Agentic Era

A deep dive into AgentMail's thesis that agents will become first-class internet users, with email as their primary communication protocol. Based on technical discussions with co-founder Adi Singh.

AI Agents, Email Infrastructure, Agent-to-Agent, YC, Deep Dive

Ryan Brandt

November 12, 2025·8 min read

Cursor: The Everything App

I wanted to interview at OpenAI but didn't know anyone there. So I built an agent in Cursor to optimize my cold outreach. That worked. Then I kept building. Now Cursor runs my entire life.

AI, Cursor, Automation, Productivity

Ryan Brandt

October 28, 2025·22 min read

Testing LangSmith's Insights Agent: 87.92% Coverage in 35 Minutes

We spent 20 hours with domain experts manually annotating 207 production agent traces to understand failure patterns. Then we tested whether LangSmith's Insights Agent could automate this process. It found 87.92% of our failure patterns in 35 minutes.

AI, Evals, LangSmith, Testing, Agent Engineering

Ryan Brandt

October 13, 2025·14 min read

The Unknown Unknowns Problem in AI Evaluation

Why automated tests miss the failures that matter most, and how manual error analysis discovers the bugs you never imagined existed.

AI, Evals, Testing, Error Analysis, Engineering

Ryan Brandt

October 10, 2025·7 min read

The $500 AI That Just Beat Gemini at Abstract Reasoning

Samsung's 7-million-parameter model outperforms giants on ARC-AGI 2. As the lead contributor to that benchmark, here's why this matters and what it means for the future of AI.

AI, Machine Learning, Reasoning, Efficiency, Research

Ryan Brandt

October 8, 2025·13 min read

How to Actually Evaluate Your LLM (And Stop Guessing)

A methodological walkthrough using a hypothetical customer service bot to show how to move from vibes-based evaluation to systematic, measurable improvements.

AI, Evals, LLM, Product Design, Engineering

Ryan Brandt

July 29, 2025·5 min read

Prompting 101: How to Make a Good Prompt

A practical guide to writing clear, effective prompts that get consistent results from LLMs.

Prompts, AI Development, LLM, Tutorial

Ryan Brandt

July 25, 2025·8 min read

The Most Valuable Part of Evals Cannot Be Automated

A simple, non-technical guide to fixing AI agents by analyzing what went wrong, measuring the impact, and improving systematically.

Evals, AI Development, Agentic Workflows, Debugging Agents

Ryan Brandt

July 22, 2025·7 min read

Application-Centric Evals: Stop Playing Whack-a-Mole

How to ship something people trust, come back to, and pay for. Inspired by Hamel Husain and Shreya Shankar's course.

Evals, AI Development, LLM, Product

Ryan Brandt

July 3, 2025·9 min read

How MCP actually works and why FastMCP is the easiest way to use it

Breaking down how the Model Context Protocol works, why it's structured the way it is, and why FastMCP is the best way to implement it in practice.

MCP, AI Development, Protocol, FastMCP, AI Agents, LangChain

Ryan Brandt

January 20, 2025·18 min read

Building High-Quality LLM Judges: A Data-Driven Approach with Claude Code

How we achieved 82% recall with only a 2% generalization gap through 10 iterations of systematic prompt engineering in a single afternoon.

AI, Evals, LLM, Prompt Engineering, Claude Code