Repurposing Content with AI Agents
A hands-on tutorial teaching patterns for working with AI coding assistants — research, delegation, parallelism, and semantic enrichment. The email pipeline is the example; the patterns are the point.
This course teaches you how to build a content processing pipeline with AI agents. You’ll convert 218 email newsletters into structured, searchable markdown with rich metadata. But the interesting part isn’t what you build — it’s how you build it.
Instead of writing scripts from scratch, you teach AI agents to research solutions, write other agents, and process data at scale. The result is a pipeline that extracts content, downloads images, and applies rich taxonomy metadata — built through conversation rather than upfront specification.
The Four Phases
Phase 1: Setting Up
Install the tools and create a research agent that can learn about technologies for you.
The key insight: instead of manually writing agent instructions, you describe what you want and let the AI write the agent definition. This “prompts to build prompts” pattern means you work at a higher level of abstraction.
Phase 2: Cleaning and Organizing the Data
Ask the research agent how to extract EML files to markdown. It evaluates options, considers your environment, and documents its findings. Then build the converter based on that research. The agent discovers that plain text email content is already markdown-formatted — a simplification you wouldn’t have found by jumping straight to code.
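As a rough illustration of the extraction step, here is a minimal sketch that pulls the plain-text part out of an EML file with Python's standard `email` package and wraps it as markdown. The function name `eml_to_markdown` and the choice to use the subject as a top-level heading are assumptions for illustration, not the converter the course builds.

```python
from email import policy
from email.parser import BytesParser


def eml_to_markdown(raw: bytes) -> str:
    """Build a markdown document from the plain-text part of an email."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    # Prefer text/plain: per the course, this part is often already markdown.
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    # Normalize line endings and trim surrounding whitespace.
    body = body.replace("\r\n", "\n").strip()
    subject = msg["subject"] or "(no subject)"
    return f"# {subject}\n\n{body}\n"
```

Calling `eml_to_markdown(Path("newsletter.eml").read_bytes())` (with a hypothetical filename) would return the text ready to write to a `.md` file. HTML-only emails would need a separate conversion step, which this sketch does not attempt.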
Phase 3: Decorating the Data with Subagents
Analyze the converted content to build a taxonomy schema. Create a classification agent that applies that schema to each file. Run it in parallel batches: “classify the next 50 emails” spawns 50 concurrent agents. What would take hours manually completes in minutes.
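The batch idea can be pictured with an ordinary thread pool. This is only an analogy, sketched under assumptions: in the course the concurrency comes from the AI assistant spawning subagents, not from Python threads, and `classify_batch` and `fake_classify` are hypothetical names standing in for that machinery.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def classify_batch(
    paths: list[str],
    classify: Callable[[str], dict],
    max_workers: int = 50,
) -> dict[str, dict]:
    """Run classify on every path concurrently and map path -> result."""
    if not paths:
        return {}
    workers = min(max_workers, len(paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its result.
        return dict(zip(paths, pool.map(classify, paths)))


def fake_classify(path: str) -> dict:
    # Hypothetical stand-in for an LLM-backed classification agent.
    return {"primary_topic": "industry" if "indie" in path else "unknown"}
```

Each worker handles one file independently, which is what makes the batch easy to parallelize: no file's classification depends on another's.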
Phase 4: Cleanup and Validation
Validation scripts catch inconsistencies across 200+ files. Fix edge cases, refine the schema, iterate until the pipeline produces clean output.
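A validation script of this kind can be quite small. The sketch below checks that each file has a terminated frontmatter block and a set of required top-level fields. The `REQUIRED_KEYS` set is an assumed subset of the schema, and the line-based key scan is a deliberate simplification rather than a full YAML parser.

```python
from typing import Optional

# Assumed subset of required fields; the real schema may differ.
REQUIRED_KEYS = {"title", "date", "series", "content_type", "primary_topic"}


def frontmatter_keys(text: str) -> Optional[set]:
    """Return top-level frontmatter keys, or None if the block is absent or unterminated."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start at column 0; skip indented lines and list items.
        if line and not line[0].isspace() and not line.startswith("-") and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return None  # opening delimiter without a closing one


def validate(text: str) -> list:
    """Return a list of human-readable problems; an empty list means the file passed."""
    keys = frontmatter_keys(text)
    if keys is None:
        return ["missing or unterminated frontmatter"]
    return [f"missing field: {key}" for key in sorted(REQUIRED_KEYS - keys)]
```

Running `validate` over every file and printing only the non-empty results gives a quick worklist of files to fix or reclassify.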
The Patterns (Not Just the Pipeline)
This course is about patterns that apply far beyond email processing:
| Pattern | What You Learn | Real-World Application |
|---|---|---|
| Context-Aware Research | Ask the LLM how to solve problems within your project. It sees your existing stack and gives tailored recommendations. | “How should we set up Python?” becomes “extend your existing mise.toml” rather than “install pyenv.” |
| Agents Writing Agents | Describe what you want in plain language, let AI write the agent instructions. You stay at the “what” level while it handles the “how.” | Build specialized subagents without writing prompt engineering from scratch. |
| Compounding Knowledge | Research agents inform specialized agents. A tech-research-advisor produces reports that become context for building an email-taxonomy-classifier. | Each layer builds on the last — cumulative intelligence. |
| Parallel Execution | Process batches with concurrent agents. Classifying 50 emails spawns 50 parallel workers. | Increase throughput on agent workloads by adding concurrent workers, within whatever rate limits apply. |
| Semantic Enrichment | LLMs extract meaning that would be tedious manually: topics, mentioned people, companies, AI models, audience level, content depth. | Any unstructured content becomes queryable and cross-linked. |
What You’ll End Up With
content/
├── uncategorized/ # 218 markdown files (raw extraction)
├── decorated/ # 210 classified files (with taxonomy)
└── images/ # 710 downloaded images
reports/ # AI research findings
.claude/agents/ # Custom agent definitions
magazine/ # Generated magazine website
Each decorated file has rich frontmatter:
---
title: "Golden Age for Indie Devs and Engineers"
date: 2024-08-26
series: fod
episode: 64
content_type: digest
primary_topic: industry
tags: [indie-dev, ai-tools, entrepreneurship]
people_mentioned:
  - name: "Andrej Karpathy"
    role: "AI Researcher"
companies_mentioned: [OpenAI, Anthropic]
audience: technical
depth: overview
---
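Once frontmatter like this exists, cross-linking is mostly bookkeeping. As a minimal sketch, assuming the frontmatter has already been parsed into one dictionary per file, the hypothetical helper `build_tag_index` below inverts the `tags` field into a tag-to-files lookup.

```python
from collections import defaultdict


def build_tag_index(docs: dict) -> dict:
    """Map each tag to the sorted list of files that carry it.

    docs maps a file path to its parsed frontmatter dictionary.
    """
    index = defaultdict(list)
    for path, meta in docs.items():
        for tag in meta.get("tags", []):
            index[tag].append(path)
    return {tag: sorted(paths) for tag, paths in index.items()}


# Hypothetical file names and metadata, for illustration only.
docs = {
    "fod-064.md": {"tags": ["indie-dev", "ai-tools", "entrepreneurship"]},
    "fod-065.md": {"tags": ["ai-tools"]},
    "untagged.md": {},
}
index = build_tag_index(docs)
```

The same inversion works for `people_mentioned` or `companies_mentioned`, which is one way "related articles" links could be generated.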
Start Here
- About This Course — Prerequisites, tools, and what you need to get started
- Phase 1: Research & Setup — Install tools, create your first research agent
- Phase 2: Extraction — Convert EML emails to clean markdown
- Phase 3: Classification — Build a taxonomy and classify at scale
- Phase 4: Validation — Scripts, edge cases, and iteration
- Project Ideas — Ways to apply these patterns to your own work
01 chapters
Install the tools and create a research agent that can learn about technologies for you.
Use the research agent to decide how to extract EML files, then build the converter and download images.
Analyze converted content to build a taxonomy schema, create a classification agent, and run parallel classification at scale.
Validation scripts catch inconsistencies across 200+ files. Fix edge cases, refine the schema, and iterate.
02 research
Install mise, Python 3.12, uv, and Claude Code. Create a git repository and configure your environment at the right level of abstraction.
Teach the AI agent to research problems within your specific project context, producing recommendations tailored to your stack.
The core pattern — describe agent behavior in plain language and let Claude write the agent definition. Compose knowledge through project memory.
03 extraction
Use the research agent to evaluate extraction approaches, build the converter, and discover that plain text emails are already markdown.
Download images from emails, replace remote URLs with local paths, and handle edge cases like tracking pixels.
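The URL-rewriting half of this step can be sketched without touching the network. The version below assumes images appear as standard markdown image links, names local files by a hash of the URL, and treats any URL containing one of a few substrings as a tracking pixel. All three are illustrative assumptions, and the heuristic in `TRACKING_HINTS` is deliberately crude.

```python
import hashlib
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

IMAGE_RE = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")
TRACKING_HINTS = ("/open", "/track", "pixel")  # assumed heuristic, not exhaustive


def local_name(url: str) -> str:
    """Derive a stable filename: short URL hash plus the original extension."""
    suffix = PurePosixPath(urlparse(url).path).suffix or ".img"
    return hashlib.sha256(url.encode()).hexdigest()[:12] + suffix


def localize_images(markdown: str, image_dir: str = "images"):
    """Rewrite remote image links to local paths and drop likely tracking pixels.

    Returns the rewritten markdown and a url -> local path map of downloads to perform.
    """
    downloads = {}

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        if any(hint in url.lower() for hint in TRACKING_HINTS):
            return ""  # remove the tracking pixel entirely
        path = f"{image_dir}/{local_name(url)}"
        downloads[url] = path
        return f"![{alt}]({path})"

    return IMAGE_RE.sub(replace, markdown), downloads
```

Separating the rewrite from the download keeps the function easy to test, and the returned map can then be handed to whatever fetches the files.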
04 classification
Analyze all content files to identify patterns, series, and topics. Design a comprehensive frontmatter schema for classification.
Create a classification agent and run it against batches of files. 50 concurrent subagents complete in minutes what would take hours manually.
LLMs extract people, companies, models, audience level, and content depth — metadata that would be tedious to create manually.
05 validation
Build automated validation to catch inconsistencies across 200+ files. Handle malformed YAML, mismatched content, and missing files.
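One of the simplest checks is for missing files: anything extracted but never classified. This sketch assumes decorated files keep the same filename as their uncategorized source, which the course does not state explicitly, and `missing_decorated` is a hypothetical helper name.

```python
from pathlib import Path


def missing_decorated(uncategorized: Path, decorated: Path) -> list:
    """List markdown filenames present in uncategorized/ but absent from decorated/."""
    source = {p.name for p in uncategorized.glob("*.md")}
    done = {p.name for p in decorated.glob("*.md")}
    return sorted(source - done)
```

Called as `missing_decorated(Path("content/uncategorized"), Path("content/decorated"))`, it returns the files that still need a classification pass.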
Refine the schema, re-run classification, and apply the patterns to build a magazine website from structured content.