Build Your Own AI Coding Agent in a Weekend
Build a fully functional AI coding assistant from scratch — without writing a single line of code. Learn how agents actually work by building one.
3 parts · 11 chapters · Will Schenk
What You’ll Build
By the end of this tutorial, you’ll have a fully functional AI coding assistant that can:
- Navigate and understand your codebase
- Edit files with precision using structured diff tools
- Support user-defined custom skills to extend functionality
- Self-monitor the quality of its own codebase
- Generate images and videos
- Search the web for documentation and solutions
- Spawn specialized sub-agents for focused tasks
- Track costs so you don’t blow your API budget
- Log sessions for debugging and improvement
More importantly, you’ll understand why each piece exists and how they fit together. You’ll do it all just using English prompts — reading code is almost optional.
Why Build This?
There’s a fear out there about how intellectual work will change now that we have intelligence on tap. People have valid concerns, but most of the arguments are born of fear and ignorance. This tutorial is for everyone, regardless of coding background, to get hands-on experience with what these tools can actually do and how they fit into your working process.
You’ll join me in step-by-step building of a piece of software, one that is smart enough to build itself. I give you the prompts — I thought up what to ask, I know what we’re going to build and how it’ll work. If you aren’t an experienced developer, you wouldn’t have known what to ask it. But you can follow along with what I’m doing, and you can do it yourself.
What’s exciting isn’t just how much easier my job is — how I can operate even closer to the things of thought-stuff — but that many more people can now operate closer to the poetry of coding.
These techniques will help you solve the sorts of problems that used to require a coder. If those solutions delivered value before, it is now in your power to create that value yourself. And even if you don’t end up coding your whole world, this will help you develop some fingerspitzengefühl (a fingertip feel) for what these new tools can do.
This tutorial peels back the layers. You’ll start with a 50-line bootstrap script that captures the core of what an agent is and then build everything out.
How Agents Actually Work
Before diving in, let’s demystify what’s happening under the hood. Every AI coding assistant — Cursor, Claude Code, Copilot, or what we’re building — follows the same pattern.
The Agent Loop
It’s a loop. You say something, the LLM responds. If it asks for a tool, you run it. The loop goes around and around.
1. User enters input
2. Send to the LLM
3. LLM runs the simulation, returns tokens
   - 3a. Includes tool call? Call the tool, go back to step 2
4. Show the result to the user
5. Go to step 1
That’s it. The “AI” part is the LLM. Everything else is plumbing.
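The loop above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the course's bootstrap script: the `Model`, `Tool`, and `Message` types, the `runTurn` function, and the scripted `fakeModel` are all hypothetical stand-ins so the example runs without an API key.

```typescript
// Hypothetical types: real provider APIs differ in shape and naming.
type ToolCall = { name: string; args: Record<string, string> };
type ModelReply = { text: string; toolCall?: ToolCall };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (history: Message[]) => ModelReply;
type Tool = (args: Record<string, string>) => string;

// One user turn: keep calling the model until it stops asking for tools.
function runTurn(
  input: string,
  history: Message[],
  model: Model,
  tools: Map<string, Tool>,
  maxSteps = 10, // guard against a model that never stops calling tools
): string {
  history.push({ role: "user", content: input }); // user enters input
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(history); // send to the LLM, get a reply back
    history.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) return reply.text; // no tool call: show the result
    const tool = tools.get(reply.toolCall.name);
    const result = tool
      ? tool(reply.toolCall.args)
      : `Unknown tool: ${reply.toolCall.name}`;
    history.push({ role: "tool", content: result }); // feed result back, loop
  }
  return "Stopped: too many tool calls in one turn.";
}

// A scripted stand-in for the LLM: asks for a tool once, then answers.
const fakeModel: Model = (history) => {
  const last = history[history.length - 1];
  return last !== undefined && last.role === "tool"
    ? { text: `The file says: ${last.content}` }
    : { text: "", toolCall: { name: "read_file", args: { path: "notes.txt" } } };
};

const fakeTools = new Map<string, Tool>([
  ["read_file", (args) => `contents of ${args.path}`],
]);

const transcript: Message[] = [];
const answer = runTurn("What is in notes.txt?", transcript, fakeModel, fakeTools);
console.log(answer);
```

In a real agent the `model` function would be a network call and the outer "go to step 1" would be a prompt waiting for the next user input. The `maxSteps` guard is a design choice of this sketch, added so a misbehaving model cannot loop forever.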
The Harness: Prompts + Tools + Model
What we’re building is called a harness. It’s a combination of a prompt, tools, and a model. Think of it as three layers working together:
| Layer | What It Is | What It Does |
|---|---|---|
| Model | The LLM (Claude, GPT, Gemini) | The “brain” — understands intent, generates responses, decides what tools to call |
| Tools | Functions the model can invoke | The “hands” — read files, run commands, search the web |
| Prompts | Instructions shaping behavior | Clear directions on what you want the agent to do |
The model is powerful but blind — it can’t see your filesystem or run code. Tools give it capabilities. Prompts tell it how to use them wisely.
This combination is the harness. Swap the model, and the same harness behaves differently. Change the prompts, and the same model acts differently. Add tools, and new capabilities emerge.
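One way to picture the three layers is as a plain configuration object. The `Harness` and `ToolSpec` shapes here are assumptions made for illustration, and the prompt text and tool names are invented; only the two model IDs come from this course.

```typescript
// Hypothetical shape for a harness; the course's own structure may differ.
interface ToolSpec {
  name: string;
  description: string; // what the model reads when deciding to call it
}

interface Harness {
  model: string; // a model ID, e.g. as used with OpenRouter
  systemPrompt: string;
  tools: ToolSpec[];
}

const base: Harness = {
  model: "google/gemini-3-pro-preview",
  systemPrompt: "You are a careful coding assistant. Prefer small edits.",
  tools: [{ name: "read_file", description: "Read a file from disk" }],
};

// Swap one layer at a time; the other two stay untouched.
const otherModel: Harness = { ...base, model: "anthropic/claude-opus-4.5" };
const moreTools: Harness = {
  ...base,
  tools: [...base.tools, { name: "run_command", description: "Run a shell command" }],
};

console.log(otherModel.model, moreTools.tools.map((t) => t.name));
```

Keeping the layers separate like this is what makes it cheap to experiment: each variant is a one-line change against the same base.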
Key Concepts
Tool Calling — Modern LLMs don’t just output text. They can output structured requests like “call the read_file function with path /src/index.ts”. Your code executes that function and sends the result back. The model never actually runs anything — it just asks.
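To make the "it just asks" point concrete, here is a sketch of the receiving side. The JSON format, the `ToolRequest` type, and the in-memory `fakeFiles` map are hypothetical; each provider defines its own tool-call schema, and a real tool would read from disk.

```typescript
// Hypothetical wire format; each provider defines its own tool-call schema.
interface ToolRequest {
  tool: string;
  arguments: { path: string };
}

// Stand-in for a real file read so the sketch runs without a filesystem.
const fakeFiles = new Map<string, string>([
  ["/src/index.ts", 'console.log("hello");'],
]);

function readFile(path: string): string {
  return fakeFiles.get(path) ?? `Error: no such file ${path}`;
}

// The model only produces this text. Our code decides whether to act on it.
const modelOutput = '{"tool":"read_file","arguments":{"path":"/src/index.ts"}}';

function handleToolRequest(raw: string): string {
  const request = JSON.parse(raw) as ToolRequest;
  if (request.tool !== "read_file") {
    return `Error: unknown tool ${request.tool}`;
  }
  return readFile(request.arguments.path);
}

// This string is what gets sent back to the model as the tool result.
const toolResult = handleToolRequest(modelOutput);
console.log(toolResult);
```

Because your code sits between the request and the action, it is also the natural place to refuse or sanitize a call. Returning an error string rather than throwing lets the model see what went wrong and try something else.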
Context Window — The LLM’s working memory. Every message, every tool result, every instruction competes for space in a fixed-size window (typically 128K–200K tokens). When you cross that limit, things get stupid. Managing context is the central challenge of agent design.
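A rough sketch of what managing context can mean in practice: keep the system prompt and drop the oldest messages until the rest fits a budget. This is the bluntest possible strategy, shown only to illustrate the constraint. The `fitToBudget` function is hypothetical, and the four-characters-per-token estimate is a loose assumption, since real tokenizers vary by model.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };

// Rough assumption: about 4 characters per token. Real tokenizers vary.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the system prompt, then as many of the newest messages as fit.
function fitToBudget(messages: ChatMessage[], budget: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const kept: ChatMessage[] = [];
  // Walk from newest to oldest so recent context survives.
  for (const message of [...rest].reverse()) {
    const cost = estimateTokens(message.content);
    if (used + cost > budget) break; // this and all older messages are dropped
    kept.unshift(message);
    used += cost;
  }
  return [...system, ...kept];
}

const conversation: ChatMessage[] = [
  { role: "system", content: "Be terse." }, // about 3 tokens
  { role: "user", content: "a".repeat(40) }, // about 10 tokens
  { role: "tool", content: "b".repeat(400) }, // about 100 tokens
  { role: "user", content: "c".repeat(20) }, // about 5 tokens
];

const trimmed = fitToBudget(conversation, 110);
console.log(trimmed.map((m) => m.role)); // the oldest user message is dropped
```

Note how a single large tool result consumes most of the budget here. That is typical of coding agents, where file contents and command output crowd out the conversation itself.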
System Prompt — Instructions on what you want the agent to do. What sorts of responses are you looking for? Do you want it to be terse, or complete with responses that fit a clear template? A coding agent has multiple prompts for different needs.
Streaming — Instead of waiting for the full response, you receive tokens as they’re generated. This is why you see AI assistants “typing” — it’s not theater, it’s the actual generation happening in real time.
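The consuming side of streaming looks roughly like this. A synchronous generator stands in for the network here so the sketch runs on its own; real APIs deliver chunks asynchronously, and the `fakeTokenStream` and `consume` names are hypothetical.

```typescript
// A generator stands in for the network stream. Real APIs deliver chunks
// asynchronously, but the consuming pattern is the same.
function* fakeTokenStream(): Generator<string> {
  const tokens = ["Agents", " are", " loops", "."];
  for (const token of tokens) yield token;
}

// Handle each token as it arrives and also build up the full reply.
function consume(stream: Iterable<string>, onToken: (t: string) => void): string {
  let full = "";
  for (const token of stream) {
    onToken(token); // e.g. print it immediately so the user sees progress
    full += token;
  }
  return full;
}

const seen: string[] = [];
const reply = consume(fakeTokenStream(), (t) => seen.push(t));
console.log(reply);
```

The callback is where a terminal UI would write each token to the screen, while the accumulated string is what gets stored in the conversation history once the reply is complete.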
What Makes a Good Agent?
The model. 99% the model.
Given the same model, the remaining differences come from the harness and how cleverly you manage context:
- Good context — Making sure it understands your problem, where you are with it, what needs to happen
- Good tools — Safe, predictable, well-described so the model knows when to use them
- Good prompts — Clear instructions that guide without over-constraining
- Good observability — Logging and cost tracking so you know what’s happening
- Good architecture — Context management, subagents for complex tasks, extensibility
That’s what we’re building.
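As a small taste of the observability item, here is a sketch of per-turn cost tracking. The prices are invented for illustration, and the `Pricing` and `Usage` shapes are assumptions; real prices vary by model and the usage numbers would come from each API response.

```typescript
// Hypothetical prices in USD per million tokens; look up real ones per model.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

function turnCost(usage: Usage, pricing: Pricing): number {
  return (
    (usage.inputTokens / 1_000_000) * pricing.inputPerMillion +
    (usage.outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

const examplePricing: Pricing = { inputPerMillion: 2, outputPerMillion: 10 };
const turns: Usage[] = [
  { inputTokens: 12_000, outputTokens: 800 },
  { inputTokens: 15_000, outputTokens: 1_200 },
];

// Report the cost of each turn along with the running session total.
let sessionTotal = 0;
for (const usage of turns) {
  const cost = turnCost(usage, examplePricing);
  sessionTotal += cost;
  console.log(`turn: $${cost.toFixed(4)}  session: $${sessionTotal.toFixed(4)}`);
}
```

Input tokens usually dominate in an agent, because the whole conversation so far is resent on every turn. Watching the per-turn number grow is often the first sign that context needs trimming.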
What You’ll Understand After
By the end of this tutorial, you won’t just have working code — you’ll have mental models:
- Agents are loops: The core pattern is simple. Complexity comes from handling edge cases well.
- Tools are the interface: Good tool design is the difference between helpful and dangerous.
- Context is the constraint: Everything in agent design traces back to managing finite context.
- Observability isn’t optional: If you can’t see what happened, you can’t fix what went wrong.
- Extensibility requires architecture: Ad-hoc additions become unmaintainable fast.
- Safety and capability trade off: More power means more risk. Design accordingly.
Course Structure
| Part | Title | Chapters | What You Build |
|---|---|---|---|
| Part 1 | Setup | 01–03 | Environment, bootstrap agent, project scaffolding |
| Part 2 | Core | 04–07 | Tools, cost tracking, session logging, research |
| Part 3 | Advanced | 08–11 | TUI upgrades, subagents, skills, polish |
What You Need
- A terminal (free with your computer)
- An OpenRouter API key (with credit card — a few dollars goes a long way)
- About a weekend of focused time
- Zero coding experience required
The model we’ll use: google/gemini-3-pro-preview (or anthropic/claude-opus-4.5 if you prefer).
Prerequisites
This course is beginner-friendly. You don’t need to:
- Know TypeScript (though we’ll use it)
- Have built an API client before
- Understand LLM theory
You DO need to:
- Be comfortable in a terminal
- Type `bun` commands
- Copy and paste prompts
Let’s Go
01 chapters
Chapters 01–03: Get your development environment running, build the smallest possible coding agent, and scaffold a proper project.
Chapters 04–07: Add code navigation tools, cost tracking, session logging, and web research capabilities.
Chapters 08–11: Upgrade the UI, add subagents and skills, and polish everything with introspection and compaction.
02 setup
- Chapter 01: Install mise, bun, and get your OpenRouter API key. The foundation for everything.
- Chapter 02: Run the smallest possible coding agent, a 50-line script that captures the core of what an agent is.
- Chapter 03: Convert the bootstrap script into a structured TypeScript project with tests, linting, and modular code.
03 core
- Chapter 04: Add structured tools for navigating, editing, and verifying code. Move beyond raw bash to safe, predictable operations.
- Chapter 05: Track model pricing, usage, and context window. Display cost after every turn so you never blow your API budget.
- Chapter 06: Record every message in JSONL format for debugging, analysis, and future compaction. Add slash commands for session control.
- Chapter 07: Add URL downloading (HTML to markdown) and web search via Tavily API. Create a specialized research agent.
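For readers unfamiliar with the JSONL session format mentioned above, it is simply one JSON object per line. The sketch below shows the round trip in memory; the `LogEntry` fields are hypothetical, and the course's actual log records may carry more information.

```typescript
// Hypothetical record shape; the course's log format may include more fields.
interface LogEntry {
  timestamp: string;
  role: "user" | "assistant" | "tool";
  content: string;
}

// JSONL: one JSON object per line, so a log can be appended to and read
// back line by line without parsing the whole file.
function toJsonl(entries: LogEntry[]): string {
  return entries.map((entry) => JSON.stringify(entry)).join("\n") + "\n";
}

function fromJsonl(text: string): LogEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LogEntry);
}

const sessionLog: LogEntry[] = [
  { timestamp: "2025-01-01T10:00:00Z", role: "user", content: "List the files" },
  { timestamp: "2025-01-01T10:00:02Z", role: "tool", content: "index.ts\nREADME.md" },
];

const serialized = toJsonl(sessionLog);
console.log(serialized);
```

Newlines inside a message are escaped by `JSON.stringify`, so each record still occupies exactly one line. That property is what makes the format safe to append to while a session is running.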
04 advanced
- Chapter 08: Improve the terminal interface with model info on startup, status display, collapsible tool blocks, and better input handling.
- Chapter 09: Add the subagent system: specialized agents with different prompts and tools. Build a code-map generator and integrate the prompt system.
- Chapter 10: Build an extensible skill system where capabilities are loaded on-demand from SKILL.md files. Add image and video generation.
- Chapter 11: Clean up tool implementations, add session compaction, self-analysis, and document known issues.