Multi-Agent Architecture Patterns
Multi-agent systems are how production AI workflows handle tasks too complex for a single prompt. This lesson covers the core orchestration patterns, when to use them, and real-world examples from coding agents, research workflows, and enterprise automation.
Single Agent vs Multi-Agent: When Do You Need More Than One?
A single agent is an LLM with a system prompt, a set of tools, and a loop that reasons about which tool to call next. This works well for straightforward tasks: answer a question, write a function, summarize a document.
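To make the "loop that reasons about which tool to call next" concrete, here is a minimal sketch. The `fake_llm` function, the `TOOLS` registry, and the decision dictionary format are all hypothetical stand-ins for a real model call and real tools. Only the shape of the loop is the point.

```python
from typing import Callable

# Hypothetical tool registry: plain functions standing in for real tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}


def fake_llm(task: str, history: list[tuple[str, str]]) -> dict:
    """Stand-in for a model call: chooses the next action from the history so far."""
    if not history:
        return {"action": "tool", "tool": "word_count", "input": task}
    return {"action": "finish", "answer": f"The task has {history[-1][1]} words."}


def run_agent(task: str, max_steps: int = 5) -> str:
    """Single-agent loop: ask the model, run the chosen tool, feed the result back."""
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        decision = fake_llm(task, history)
        if decision["action"] == "finish":
            return decision["answer"]
        observation = TOOLS[decision["tool"]](decision["input"])
        history.append((decision["tool"], observation))
    return "Stopped: step limit reached."


if __name__ == "__main__":
    print(run_agent("summarize this short document"))  # The task has 4 words.
```

The `max_steps` cap is a design choice worth keeping in any real loop: without it, a model that never chooses "finish" would run indefinitely.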
Multi-agent systems become necessary when:
- The task requires different expertise: A coding task needs one agent to write code, another to review it, and a third to write tests. Each agent has different system prompts, tools, and evaluation criteria.
- The task has natural parallelism: Research across 10 sources can be done by 10 agents simultaneously, with a final agent synthesizing results.
- Quality requires adversarial checking: A "writer" agent drafts content while a "critic" agent evaluates it. This debate pattern can catch errors that a single agent reviewing its own output tends to miss.
- The workflow has distinct phases: Planning, execution, and verification are fundamentally different tasks that benefit from specialized agents.
The Four Core Orchestration Patterns
Most production multi-agent systems use one of these four patterns, or a combination of them:
1. Sequential (Pipeline)
Agents execute one after another. The output of Agent A becomes the input of Agent B. This is the simplest multi-agent pattern.
# Sequential pipeline: Planner -> Coder -> Reviewer -> Tester
# The agent classes are assumed to be defined elsewhere; each exposes
# a `name` attribute and an async `execute(context)` method.
class SequentialPipeline:
    def __init__(self):
        self.agents = [
            PlannerAgent(),   # Breaks task into implementation steps
            CoderAgent(),     # Writes code based on the plan
            ReviewerAgent(),  # Reviews code for bugs, style, security
            TesterAgent(),    # Writes and runs tests
        ]

    async def run(self, task: str) -> dict:
        context = {"original_task": task}
        for agent in self.agents:
            result = await agent.execute(context)
            context[agent.name] = result  # Each agent sees all prior results
            if result.get("status") == "blocked":
                # Stop early and report which agent could not proceed
                return {"status": "blocked", "blocked_at": agent.name, "reason": result["reason"]}
        return context
When to use: Tasks with clear phases (plan → implement → review → test). Code generation workflows, content pipelines, data processing chains.
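The early exit on a "blocked" status is easiest to see with stub agents. The sketch below is a hypothetical, self-contained version of the pipeline loop. `StubAgent` returns canned results instead of calling a model, and the agent names and result fields are assumptions made for illustration.

```python
import asyncio


class StubAgent:
    """Hypothetical agent that returns a canned result instead of calling a model."""

    def __init__(self, name: str, result: dict):
        self.name = name
        self.result = result

    async def execute(self, context: dict) -> dict:
        return self.result


async def run_pipeline(agents: list, task: str) -> dict:
    """Run agents in order, stopping at the first one that reports 'blocked'."""
    context = {"original_task": task}
    for agent in agents:
        result = await agent.execute(context)
        context[agent.name] = result
        if result.get("status") == "blocked":
            return {"status": "blocked", "blocked_at": agent.name, "reason": result["reason"]}
    return context


def build_demo_agents() -> list:
    return [
        StubAgent("planner", {"status": "ok", "steps": ["parse input", "write handler"]}),
        StubAgent("coder", {"status": "blocked", "reason": "missing API spec"}),
        StubAgent("reviewer", {"status": "ok"}),  # Never runs in this demo
    ]


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(build_demo_agents(), "add a webhook endpoint")))
```

Because the coder reports "blocked", the reviewer is never called, and the caller learns which stage stopped the run and why.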
2. Parallel (Fan-Out / Fan-In)
Multiple agents work on independent sub-tasks simultaneously. A coordinator agent splits the work and a synthesizer agent combines the results.
# Parallel fan-out: Research multiple sources simultaneously
# TaskSplitterAgent, ResearchAgent, and SynthesizerAgent are assumed to be defined elsewhere.
import asyncio


class ParallelResearch:
    def __init__(self, num_researchers: int = 5):
        self.splitter = TaskSplitterAgent()
        self.researchers = [ResearchAgent(id=i) for i in range(num_researchers)]
        self.synthesizer = SynthesizerAgent()

    async def run(self, question: str) -> str:
        # Fan-out: split into sub-questions
        sub_questions = await self.splitter.split(question)
        # Execute in parallel, assigning sub-questions to researchers round-robin
        tasks = [
            self.researchers[i % len(self.researchers)].research(sq)
            for i, sq in enumerate(sub_questions)
        ]
        # return_exceptions=True returns failures as values instead of raising
        results = await asyncio.gather(*tasks, return_exceptions=True)
        # Filter out failures (BaseException also covers cancelled tasks)
        valid_results = [r for r in results if not isinstance(r, BaseException)]
        # Fan-in: synthesize
        return await self.synthesizer.combine(question, valid_results)
When to use: Research tasks, data gathering from multiple sources, batch processing where sub-tasks are independent.
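Silently dropping failed sub-tasks can hide gaps in the final answer. As a minimal sketch, the fan-out step can also report which sub-questions failed. The `research` coroutine here is a hypothetical stand-in that fails on one input to mimic a flaky source.

```python
import asyncio


async def research(sub_question: str) -> str:
    """Hypothetical researcher: fails on one input to mimic a flaky source."""
    await asyncio.sleep(0.01)
    if "pricing" in sub_question:
        raise TimeoutError(f"source timed out for: {sub_question}")
    return f"findings on {sub_question}"


async def fan_out(sub_questions: list) -> tuple:
    """Run all sub-questions concurrently; return (successful results, failed questions)."""
    # gather preserves input order, so results line up with sub_questions.
    results = await asyncio.gather(
        *(research(sq) for sq in sub_questions), return_exceptions=True
    )
    succeeded = [r for r in results if not isinstance(r, BaseException)]
    failed = [sq for sq, r in zip(sub_questions, results) if isinstance(r, BaseException)]
    return succeeded, failed


if __name__ == "__main__":
    ok, failed = asyncio.run(fan_out(["latency", "pricing", "security"]))
    print(ok)      # ['findings on latency', 'findings on security']
    print(failed)  # ['pricing']
```

Passing the `failed` list to the synthesizer lets it state what could not be checked instead of presenting a partial answer as complete.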
3. Hierarchical (Supervisor)
A supervisor agent delegates tasks to specialist agents, monitors progress, and decides what to do next based on results. This is the most common pattern in production systems like coding agents.
# Hierarchical: Supervisor delegates to specialists
# The agent classes are assumed to be defined elsewhere.
class SupervisorSystem:
    def __init__(self):
        self.supervisor = SupervisorAgent()
        self.specialists = {
            "coder": CoderAgent(),
            "researcher": ResearchAgent(),
            "debugger": DebuggerAgent(),
            "reviewer": ReviewerAgent(),
        }

    async def run(self, task: str) -> dict:
        plan = await self.supervisor.create_plan(task)
        results = []  # Final result of each step, so retries and escalations are not lost
        for step in plan.steps:
            agent = self.specialists[step.agent_type]
            result = await agent.execute(step.instruction, step.context)
            # Supervisor decides next action based on result
            decision = await self.supervisor.evaluate(step, result)
            if decision.action == "retry":
                result = await agent.execute(decision.revised_instruction, step.context)
            elif decision.action == "escalate":
                result = await self.specialists[decision.escalate_to].execute(
                    decision.escalation_context, step.context
                )
            # "complete" needs no extra work: the first result stands
            results.append(result)
        return await self.supervisor.compile_final_result(results)
When to use: Complex tasks where the workflow cannot be predetermined. The supervisor adapts the plan based on intermediate results. This is the pattern usually attributed to coding agents such as Devin, Claude Code, and Cursor Agent.
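A supervisor that can ask for retries needs a bound on how often it does so. The sketch below shows one way to structure a single supervised step with a retry limit. `Decision`, `FlakyCoder`, and the rule-based `Supervisor` are hypothetical stand-ins; a real supervisor would make a model call inside `evaluate`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Decision:
    action: str            # "complete" or "retry"
    instruction: str = ""  # Revised instruction when action == "retry"


class FlakyCoder:
    """Hypothetical specialist that only produces tests when asked for them."""

    async def execute(self, instruction: str) -> str:
        return "patch + tests" if "tests" in instruction else "patch only"


class Supervisor:
    """Hypothetical supervisor using a fixed rule in place of a model call."""

    async def evaluate(self, instruction: str, result: str) -> Decision:
        if "tests" in result:
            return Decision("complete")
        return Decision("retry", instruction + " and include tests")


async def run_step(supervisor, agent, instruction: str, max_attempts: int = 3) -> dict:
    """Execute one step, re-evaluating after every attempt, up to max_attempts."""
    result = None
    for attempt in range(1, max_attempts + 1):
        result = await agent.execute(instruction)
        decision = await supervisor.evaluate(instruction, result)
        if decision.action == "complete":
            return {"status": "complete", "attempts": attempt, "result": result}
        instruction = decision.instruction  # Try again with the revised instruction
    return {"status": "failed", "attempts": max_attempts, "result": result}


if __name__ == "__main__":
    print(asyncio.run(run_step(Supervisor(), FlakyCoder(), "fix the login bug")))
```

Unlike a single unconditional retry, this loop has the supervisor evaluate every attempt, so a retry that still falls short is not accepted by default.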
4. Debate (Adversarial)
Two or more agents argue for different solutions. A judge agent evaluates the arguments and picks the best answer or synthesizes a consensus.
# Debate pattern: multiple agents argue, judge decides
# DebaterAgent and JudgeAgent are assumed to be defined elsewhere.
import asyncio


class DebateSystem:
    def __init__(self, num_debaters: int = 3, max_rounds: int = 3):
        self.debaters = [DebaterAgent(id=i) for i in range(num_debaters)]
        self.judge = JudgeAgent()
        self.max_rounds = max_rounds

    async def run(self, question: str) -> dict:
        # Round 1: independent answers
        positions = await asyncio.gather(*[
            d.initial_position(question) for d in self.debaters
        ])
        for _ in range(self.max_rounds):
            # Each debater sees others' positions and can revise
            positions = await asyncio.gather(*[
                self.debaters[i].revise(question, positions, i)
                for i in range(len(self.debaters))
            ])
            # Check for convergence on the revised positions, so the judge
            # later rules on the positions that actually reached consensus
            if await self.judge.has_consensus(positions):
                break
        # Judge renders final verdict
        return await self.judge.verdict(question, positions)
When to use: High-stakes decisions where accuracy matters more than speed. Code review, legal document analysis, fact-checking, architectural decisions.
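The convergence check is the piece that decides how long a debate runs. As a rough sketch, consensus could be defined as a share of debaters holding the same position. The exact-match comparison after normalization and the 0.6 threshold are both assumptions; a real judge would more likely compare positions semantically with a model call.

```python
from collections import Counter
from typing import Optional


def _normalize(position: str) -> str:
    return position.strip().lower()


def has_consensus(positions: list, threshold: float = 0.6) -> bool:
    """True when the most common position is held by at least `threshold` of debaters."""
    if not positions:
        return False
    counts = Counter(_normalize(p) for p in positions)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(positions) >= threshold


def leading_position(positions: list) -> Optional[str]:
    """Return the most common normalized position, or None if there are none."""
    if not positions:
        return None
    return Counter(_normalize(p) for p in positions).most_common(1)[0][0]


if __name__ == "__main__":
    round_one = ["Use Postgres", "Use SQLite", "Use DynamoDB"]
    round_two = ["Use Postgres", "use postgres ", "Use SQLite"]
    print(has_consensus(round_one))  # False
    print(has_consensus(round_two))  # True
    print(leading_position(round_two))  # use postgres
```

A lower threshold ends debates sooner and costs less; a higher one forces more rounds before the judge is asked for a verdict.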
Choosing the Right Pattern
| Pattern | Latency | Accuracy | Cost | Best For |
|---|---|---|---|---|
| Sequential | High (serial) | Good | Moderate | Clear phase workflows (plan → build → test) |
| Parallel | Low (concurrent) | Good | High (many calls) | Research, data gathering, independent sub-tasks |
| Hierarchical | Variable | High | Variable | Complex tasks needing adaptive planning |
| Debate | Very high | Very high | Very high | High-stakes decisions, critical code review |
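The latency and cost columns can be estimated with simple arithmetic before building anything. The sketch below assumes each agent method is one model call with a known latency in seconds. The function names and sample numbers are illustrative, not measurements.

```python
def sequential_latency(step_latencies: list) -> float:
    """Serial pipeline: total latency is the sum of every step."""
    return sum(step_latencies)


def parallel_latency(worker_latencies: list, split_latency: float, synth_latency: float) -> float:
    """Fan-out/fan-in: workers overlap, so only the slowest one counts."""
    return split_latency + max(worker_latencies) + synth_latency


def debate_max_calls(num_debaters: int, max_rounds: int) -> int:
    """Worst-case model calls for a debate that never converges early.

    Assumes one call per initial position, one per revision, one consensus
    check per round, and one final verdict.
    """
    initial = num_debaters
    per_round = num_debaters + 1  # Revisions plus the judge's consensus check
    return initial + max_rounds * per_round + 1


if __name__ == "__main__":
    steps = [2.0, 5.0, 3.0]
    print(sequential_latency(steps))          # 10.0
    print(parallel_latency(steps, 1.0, 2.0))  # 8.0
    print(debate_max_calls(3, 3))             # 16
```

With three debaters and three rounds, a debate can cost 16 calls where a single agent would cost one, which is why the table reserves it for high-stakes decisions.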
Real-World Multi-Agent Systems
Coding Agents (Claude Code, Devin)
Hierarchical pattern with a supervisor that manages planning, file editing, terminal commands, and code review agents. The supervisor maintains a task list and delegates based on the current state of the codebase.
Research Workflows (Deep Research)
Parallel fan-out pattern. A planner breaks a research question into sub-queries, parallel agents search different sources, and a synthesizer combines findings with citations and confidence scores.
Customer Support Triage
Sequential pipeline: classifier agent routes the ticket, retrieval agent fetches relevant docs, response agent drafts the reply, quality agent checks for accuracy and tone before sending.
Data Pipeline Automation
Hierarchical with parallel sub-tasks. A coordinator agent manages ETL jobs, delegates data validation to specialist agents, runs quality checks in parallel, and escalates anomalies to human operators.
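A combined pattern like this one can be sketched in a few lines: the coordinator runs independent checks concurrently, then makes the single decision about what happens next. The check functions, the row format, and the action names below are hypothetical and stand in for real validation agents.

```python
import asyncio
from typing import Optional


async def check_not_empty(rows: list) -> Optional[str]:
    """Hypothetical validation agent: returns an anomaly message or None."""
    return None if rows else "batch is empty"


async def check_no_null_ids(rows: list) -> Optional[str]:
    """Hypothetical validation agent: flags rows without an 'id' value."""
    missing = sum(1 for r in rows if r.get("id") is None)
    return f"{missing} row(s) missing id" if missing else None


async def coordinate(rows: list) -> dict:
    # Parallel sub-step: the checks are independent, so run them concurrently.
    findings = await asyncio.gather(check_not_empty(rows), check_no_null_ids(rows))
    anomalies = [f for f in findings if f is not None]
    # Hierarchical step: the coordinator alone decides what happens next.
    if anomalies:
        return {"action": "escalate_to_human", "anomalies": anomalies}
    return {"action": "load", "rows": len(rows)}


if __name__ == "__main__":
    print(asyncio.run(coordinate([{"id": 1}, {"id": None}])))
    print(asyncio.run(coordinate([{"id": 1}])))
```

Keeping the escalation decision in one place means new checks can be added to the `gather` call without touching the routing logic.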
Key Takeaways
- Multi-agent systems are for tasks that require multiple distinct capabilities, natural parallelism, or adversarial quality checking.
- Four core patterns: sequential (pipeline), parallel (fan-out/fan-in), hierarchical (supervisor), and debate (adversarial).
- Start with a single agent. Only split into multi-agent when you hit clear limitations in capability or quality.
- Most production systems combine patterns — hierarchical overall with sequential or parallel sub-workflows.
- The pattern choice depends on your latency, accuracy, and cost requirements.
What Is Next
In the next lesson, we will design individual agents — the building blocks of any multi-agent system. You will learn the ReAct pattern, how to structure agent memory and tools, and how to build a reusable agent framework in Python.
Lilly Tech Systems