1 What is an AI Agent?

Before we write any code, let’s understand what we’re building.

1.1 The Simplest Definition

An AI Agent is a program where an LLM controls the execution flow.

Traditional Program: Human writes logic → Program executes
AI Agent:           Human sets goal → LLM decides what to do

The key difference: in traditional programming, you specify how to solve a problem. With agents, you specify what you want, and the LLM figures out the how.

1.2 Why Build Your Own Framework?

Existing frameworks like LangGraph, OpenAI Agents SDK, and Google Agent Development Kit are powerful, but:

Problem	Consequence
Too much abstraction	Hard to debug when things go wrong
Hidden complexity	Can’t customize core behavior
Frequent API changes	Code breaks between versions
Kitchen-sink design	90% of features you’ll never use

By building your own minimal framework, you:

Understand every line — No magic, no surprises
Customize freely — Change anything without fighting the framework
Stay lightweight — ~1000 lines vs 100,000+ lines
Learn deeply — Best way to understand agents is to build one

1.3 Key Components

Every agent has four essential parts:

Brain — The LLM that makes decisions
Instructions — System prompt defining behavior
Tools — Functions the agent can call
Memory — Conversation history and context

flowchart LR
    User[User] --> Agent
    Agent --> LLM[Brain/LLM]
    LLM --> Tools
    Tools --> LLM
    LLM --> Agent
    Agent --> User

1.4 The Agent Loop

The core pattern is surprisingly simple:

while True:
    # 1. Send conversation to LLM
    response = llm.complete(messages, tools)

    # 2. If LLM wants to use a tool
    if response.has_tool_calls:
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            messages.append(result)
        continue  # Go back to LLM with results

    # 3. Otherwise, return the final answer
    return response.content

That’s it. Everything else is details.

1.5 What Makes Agents Powerful

Unlike traditional chatbots, agents can:

Take actions — Search the web, write files, call APIs
Iterate — Try something, observe the result, adjust
Compose — Break complex tasks into steps
Delegate — Hand off to specialized sub-agents

1.6 A Concrete Example

Imagine asking: “What’s the weather in New York and should I bring an umbrella?”

Chatbot response:

“I don’t have access to real-time weather data.”

Agent response:

Calls get_weather("New York") → “72°F, 30% chance of rain”
Reasons about the result
Returns: “It’s 72°F in New York with a 30% chance of rain. A light umbrella might be useful, but it’s not essential.”

The agent acts on the world, not just generates text.

1.7 Our Goal

In this book, we’ll build a framework that enables all of this in ~1000 lines of Python:

Part I — Single agent with tools
Part II — Multi-agent handoffs and composition
Part III — Production features (observability, streaming, MCP)

No magic, no hidden complexity. Let’s start.