Chapter 14

The Agent System Topology

The Complete Architecture of Agentic Systems

June 7, 2026 · 14 min read

Part 14 of the AOD series

The most important topology decision in computing history happened in 1964. Paul Baran at RAND published a paper that asked one question. How do you build a communication network that survives a nuclear strike?

The era's default topology was hub-and-spoke. AT&T's switching system routed every call through centralized exchanges, which made it fast and efficient and impossibly fragile. A single bomb on the right exchange would take out continental communications. The constraint Baran was solving for did not allow that answer.

Baran proposed a different shape. He sketched a mesh of distributed nodes routing packets independently with no central authority, designed so the network would degrade gracefully under attack instead of collapsing all at once. Five years later, ARPA (the agency later renamed DARPA) brought ARPANET online as a packet-switched network in the spirit of his design. The funding pitch was resource sharing among researchers, but the topology underneath was Baran's survivable mesh.

Different constraint. Different topology.

Topology is the shape of what connects to what, and on what authority. ARPANET's mesh answered for survivability. Hub-and-spoke answered for efficiency. The constraint chose the shape, not the other way around.

Figure: Two shapes, two constraints

I tell you this story because every agent system I review now is making the same kind of decision, usually without knowing it. Most teams default to the centralized topology because it is the easiest shape to draw on a whiteboard. They put one agent in the middle, hang tools off it, and feed tickets in expecting code to come out the other side. That shape works for some constraints and breaks down for others, and the difference matters more than the prompts inside any of the agents.

This chapter, the opening of Part III, shows the complete shape an agent system forms when you zoom out. Part I gave you the mental model of encapsulation, abstraction, inheritance, and polymorphism. Part II gave you the development methodology that turns those principles into reliable agents. Part III shows the architecture those layers form.

The architecture of an agent system is not decoration. It carries weight. When that weight changes, the architecture changes with it. That is what Part III is about.

The Topology

When you build single agents, you spend your time thinking about prompt design and context windows. You worry about token limits. You worry about instruction adherence. When you build agentic systems, you think about layers.

Here is the picture I now draw on the whiteboard whenever someone asks what an agent system actually looks like.

Figure: The agent system topology, four layers

That diagram is what most production agent systems actually look like in early 2026. It is hub-and-spoke at the agent layer because most teams have not yet adopted peer coordination between specialist agents. I am drawing the most common shape on purpose. The Topology Maturity Progression section later in this chapter shows where this evolves and what forces the evolution.

Four layers. Top to bottom.

The user layer is where human intent enters the system. Approval lives here. Final judgment on outcomes lives here. The user is never inside the agent stack. The user is the anchor at the top of it. If the user is missing, you do not have an agent system. You have an autonomous script running loose.

The orchestrator layer receives that intent, decides which agent should handle it, and coordinates across multiple agents when the task spans more than one domain. The orchestration patterns from Chapter 8 live here in the architecture. The orchestrator is not the smartest node in the system. It is the most trafficked.

The agent layer holds the persistent specialists that carry context across tasks. The Governance Triad introduced in Chapter 2 and formalized in Chapter 8 lives here (PM Agent, Architect Agent, Team Lead Agent). So do domain specialists like a Security Agent or a Frontend Agent. These are the service objects in your system. They earn their context over time and their value compounds.

The capability layer is where execution happens. It contains four distinct kinds of capability that an agent can reach for. Subagents are ephemeral execution, spawned for a single scoped task and then discarded. Skills inject knowledge or judgment without being independent actors. MCP servers connect the agent to external systems. Tools provide direct execution against well-defined functions. These are the methods and collaborators each agent reaches for to actually do work.

That is the picture. The topology is the class diagram I draw for agentic development. It is the artifact that survives translation between teams, between engineers and architects, between engineers and the humans who have to govern the result. Without it, every conversation about an agent system starts from scratch.
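One way to make the four layers concrete is to write them down as data. The sketch below is a minimal, hypothetical model of the hub-and-spoke shape described above. The names `Layer`, `Node`, `Topology`, and the rule that a connection only ever reaches one layer down are my own illustration, not an API from any agent runtime.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Layer(IntEnum):
    """The four layers, ordered top to bottom."""
    USER = 0
    ORCHESTRATOR = 1
    AGENT = 2
    CAPABILITY = 3


@dataclass(frozen=True)
class Node:
    name: str
    layer: Layer
    # Only meaningful at the capability layer (assumed labels):
    # "subagent", "skill", "mcp", or "tool".
    kind: str = ""


@dataclass
class Topology:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: set[tuple[str, str]] = field(default_factory=set)

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def connect(self, upper: str, lower: str) -> None:
        """Connect a node to a node exactly one layer below it."""
        a, b = self.nodes[upper], self.nodes[lower]
        if b.layer - a.layer != 1:
            raise ValueError(f"{upper} -> {lower} skips or reverses a layer")
        self.edges.add((upper, lower))
```

With this model, connecting a user node straight to a tool raises `ValueError`, which is the whiteboard rule "the user is never inside the agent stack" expressed as a check.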

One callback before moving on. Each layer is a context boundary in the sense Chapter 2 defined it. Each layer manages its own scope, its own memory, its own decisions about what to keep and what to discard. The knowledge layers from Chapter 11 ride along these same boundaries. Project Knowledge (the durable repository-anchored memory built around a single codebase) sits at the agent layer. Practitioner Knowledge (the cross-project memory that travels with the human practitioner across teams and tools) crosses agent boundaries through the capability layer. The boundaries are not decorative. They are doing real work.

Layer Responsibilities

The topology is useful because each layer owns a clean responsibility. When I review an agent system design, the first thing I check is whether the layers have stayed honest about ownership. When responsibilities blur across layers, the system gets fragile in ways that are hard to diagnose. When they stay clean, the system stays composable.

The OOP parallel makes the structure obvious to anyone who has read Parts I and II.

Part I taught you class design. How a single agent is built, how its context is encapsulated, how it inherits configuration, how it specializes.
Part II taught you the development methodology. How you actually work with these agents day to day across the ADLC.
Part III teaches you system architecture. How agents compose into a running system that does real work.

Each layer's narrow ownership is what lets the layer below it be composed freely. That is the same property that made OOP work.

A user types "deploy the new authentication service" into the interface. The orchestrator receives the command. The orchestrator does not know how to deploy a service. It does not have the AWS credentials. It does not have the Terraform scripts. The orchestrator's routing decision is polymorphic dispatch (one interface, many specialists, the system picks the right one based on the request). It identifies the intent as an infrastructure task and routes it to the DevOps Agent. If the user had typed "design a new authentication schema," the same orchestrator would have routed the request to the Architect Agent instead. The topology makes that dispatch explicit instead of leaving it implicit in whatever single agent everyone has been overloading.
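The dispatch in that example can be sketched in a few lines. This is a deliberately naive illustration: the keyword table stands in for whatever intent classification a real orchestrator would use, and the class names and return strings are hypothetical.

```python
from typing import Protocol


class Specialist(Protocol):
    """The one interface every specialist agent exposes."""
    def handle(self, request: str) -> str: ...


class DevOpsAgent:
    def handle(self, request: str) -> str:
        return f"DevOpsAgent: planning rollout for '{request}'"


class ArchitectAgent:
    def handle(self, request: str) -> str:
        return f"ArchitectAgent: drafting design for '{request}'"


class Orchestrator:
    """Routes intent. Holds no credentials and no deployment logic."""

    def __init__(self) -> None:
        # Assumption: the first word of the request identifies the intent.
        self.routes: dict[str, Specialist] = {
            "deploy": DevOpsAgent(),
            "design": ArchitectAgent(),
        }

    def dispatch(self, request: str) -> str:
        words = request.lower().split()
        verb = words[0] if words else ""
        agent = self.routes.get(verb)
        if agent is None:
            raise LookupError(f"no specialist registered for '{verb}'")
        # Same call, different specialist: polymorphic dispatch.
        return agent.handle(request)
```

The orchestrator never inspects how a specialist does its work. Adding a new domain means registering one more entry in `routes`, not changing `dispatch`.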

When an agent tries to own execution mechanics, it fails. A PM Agent should not be figuring out the exact JSON payload to update a Jira ticket. It should not be wrestling with API rate limits or auth tokens. It should rely on a Jira tool in the capability layer. Treat the agent as the brain. Treat the capability layer as the hands.

Agent vs. Subagent in the Topology

Chapter 8 introduced the distinction between persistent agents and ephemeral subagents. The topology is where that distinction becomes architecturally visible.

Persistent agents live at the agent layer. They carry context across tasks. A persistent Architect Agent remembers that you prefer functional components over class components. It remembers that your database requires specific indexing patterns because you had a production outage last month. A Security Agent that sees the seventeenth iteration of the same kind of vulnerability draws on the prior sixteen. An ephemeral one cannot. Persistence is not overhead. It is compounding judgment.

Ephemeral subagents live at the capability layer. Think of a code reviewer subagent running on a pull request. The orchestrator spins it up. It clones the repo. It runs a linter. It summarizes the diff into a clean report. It hands that report back. Then it vanishes. It is stateless by design, and the statelessness is a feature, not a limitation. Isolation is what makes it safe to compose.

The decision rule is simpler than the architecture suggests. If the node needs to remember what happened last time, it is an agent. If the node can treat every invocation as the first time, it is a subagent.

The topology enforces that decision visually. Putting a subagent at the agent layer wastes context. The node does not need persistence and you are paying the cost of carrying it anyway. Putting an agent at the capability layer destroys its accumulated judgment. You spin it up fresh every time and lose everything it learned. The layer separation is what keeps the persistence decision honest.
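The decision rule shows up directly in code shape. In the sketch below, the persistent agent is an object whose state outlives any single call, and the subagent is a pure function of its input. Both `SecurityAgent` and `lint_subagent` are hypothetical stand-ins, and the "memory" is a plain dictionary rather than anything a real runtime provides.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityAgent:
    """Persistent (agent layer): remembers findings across tasks."""
    seen: dict[str, int] = field(default_factory=dict)

    def review(self, finding: str) -> str:
        self.seen[finding] = self.seen.get(finding, 0) + 1
        count = self.seen[finding]
        if count > 1:
            return f"{finding}: seen {count} times, recommend a systemic fix"
        return f"{finding}: first occurrence, recommend a local fix"


def lint_subagent(diff: str) -> str:
    """Ephemeral (capability layer): every invocation is the first time."""
    long_lines = sum(1 for line in diff.splitlines() if len(line) > 100)
    return f"{long_lines} line(s) over 100 characters"
```

Calling `review` twice with the same finding changes the answer the second time. Calling `lint_subagent` twice with the same diff never does, which is what makes it safe to spawn and discard.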

One thing the topology does not enforce on its own. Runtime depth. The four layers describe the logical boundaries an agent system has. The runtime you are using may not expose all four as separate processes. Claude Code is the working example most readers will recognize. The main CLI session is where the orchestrator and the agent-layer specialists both run. Named subagents like PM Agent or Architect Agent sit at the agent layer, but they do not carry the spawning capability themselves by default. The recursive "subagent under an agent" branch in the diagram fires only when you grant that capability explicitly or when you compose runtimes, where a generalist orchestrator in one runtime spawns specialist agents in another and each of those spawns its own subagents. The topology is still the right map. It tells you which boundaries exist whether or not your runtime exposes them as separate processes. The boundaries you collapse for runtime convenience are the ones that get governed by accident. That is exactly the seam Book 2 picks up.

Before moving on, name the wires. The orchestrator-to-agent and agent-to-agent paths run over A2A (agent-to-agent), the protocol agents use to coordinate as peers. A2A is the horizontal counterpart to MCP, and Chapter 15 covers it in depth. The agent-to-tool path runs over MCP, which Chapter 12 covered as a practitioner skill. Agent-to-subagent uses the spawning and result-collection mechanics from Chapter 8. Skills inject knowledge or judgment wherever they are needed in the stack, which is why the diagram shows them at the capability layer but a higher-level agent can inherit them too. Chapter 14 names the wires. Chapter 15 strips them.

The Topology as Governance Map

Every layer boundary in this diagram is also a trust boundary.

That is not a metaphor I am stretching. It is the central reason the topology matters beyond the engineering elegance of clean separations. If you cannot see the boundaries in your system, you cannot govern them. If you can see them, you have a governance map.

Walk the boundaries with me, and name the threats that live at each.

The user-to-orchestrator boundary is where human intent becomes agent action. The dominant threat here is prompt injection, where adversarial input rides in disguised as legitimate intent and bends the orchestrator's interpretation of the request. Approval lives at this boundary in principle. In real systems, the failure mode is that approval is asserted in the diagram but never actually enforced at runtime. Human-in-the-loop gates and scoped consent are the mechanisms that turn the assertion into an enforcement. Without them, the orchestrator might hallucinate a mandate to rewrite your authentication system over the weekend.
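The gap between asserted and enforced approval is easy to see in a sketch. The gate below refuses to run a destructive action unless an approval callback returns true. Everything here is an assumption for illustration: the `destructive` flag, the callback signature, and the idea that a real system would classify actions this simply.

```python
from dataclasses import dataclass
from typing import Callable


class ApprovalRequired(Exception):
    """Raised when an action needs consent that was not granted."""


@dataclass(frozen=True)
class Action:
    description: str
    destructive: bool


def run_with_gate(
    action: Action,
    approve: Callable[[Action], bool],
    execute: Callable[[Action], str],
) -> str:
    """Enforce the gate at runtime, not just in the diagram."""
    if action.destructive and not approve(action):
        raise ApprovalRequired(action.description)
    return execute(action)
```

The design choice that matters is that `execute` is only reachable through the gate. If callers can invoke it directly, the approval is back to being an assertion.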

The orchestrator-to-agent boundary is where delegation happens with scope. The threat here is instruction smuggling, where the orchestrator passes along context that the downstream agent treats as authoritative when it should not. The orchestrator grants the agent a bounded mandate, and the agent should operate inside that mandate and not outside it. If the Architect Agent is asked to design a schema, it should not have the authority to push code to production. The boundary enforces that limitation only if the boundary is actually enforced.

The agent-to-capability boundary is the busiest one. Every tool call crosses it. Every MCP invocation crosses it. Every subagent spawn crosses it. The threat surface here is the widest in the system. Tool poisoning, where a capability returns content crafted to manipulate the agent that called it. The confused deputy, where the agent invokes a tool with privileges the user never intended to grant. Context exfiltration through ephemeral subagents that surface sensitive context outside the trust domain that produced it. Skill-injection threats like knowledge poisoning of the skill corpus, retrieval injection through context-augmented skills, and transitive trust through bundled skill packs the agent inherited without inspecting.

Three governance primitives ride on top of every one of these boundaries. Identity answers who the agent is acting as when it crosses the boundary. Scoped permissions answer what privileges actually transfer across the crossing. Boundary enforcement answers whether the runtime can prove the boundary held. Take any one of those three away and the boundary goes from governable to aspirational.
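A rough sketch of how the three primitives might sit on a single boundary follows. It assumes identity is a non-empty string, scopes are plain labels such as `schema:design`, and "proof" is an append-only list. A production system would use verifiable identity and tamper-evident logs. The names `Crossing` and `Boundary` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Crossing:
    acting_as: str          # identity: who the agent is acting as
    scopes: frozenset[str]  # scoped permissions: what transfers across
    operation: str          # what the caller is attempting


@dataclass
class Boundary:
    name: str
    # Boundary enforcement: every decision is recorded, allow or deny.
    audit: list[tuple[str, str, str]] = field(default_factory=list)

    def cross(self, c: Crossing) -> bool:
        allowed = bool(c.acting_as) and c.operation in c.scopes
        self.audit.append(
            (c.acting_as, c.operation, "allow" if allowed else "deny")
        )
        return allowed
```

An Architect Agent holding only `schema:design` is denied `deploy:production`, and the denial lands in the audit list. Drop the identity check, the scope check, or the log, and the corresponding primitive is gone.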

The MCP security treatment from Chapter 12 operates at exactly one of these boundaries (agent to tool). The topology shows you that there are more boundaries than that one, and that none of them get governed for free.

This is the seam where Book 2 picks up. Securing Agentic Systems builds the governance architecture directly on top of the topology drawn here. When that book talks about identity, scope, runtime monitoring, and escalation, it is placing controls on the boundaries in this diagram. The topology tells you what your system is. The boundaries in the topology tell you what you have to govern.

Topology Maturity Progression

The topology you draw is not just a shape. It is a maturity stage. Like the development discipline progression that runs from vibe coding to agentic engineering, the agent topology has its own maturity arc, and the two progressions move in parallel, stage for stage.

Stage 1: Single Agent

One agent, one context, one set of tools. Appropriate for focused tasks within a single domain. A Claude Code session writing a feature. A Cursor session refactoring a module. A Windsurf session debugging a regression. The structural cousin of vibe coding, with minimal coordination overhead and no governance separation. Most early agent demonstrations live here. Sufficient for individual-developer or single-domain workflows.

Stage 2: Hub-and-Spoke (where most production teams sit in early 2026)

The orchestrator routes to specialist agents. It is the load-bearing node, the one place routing, governance, audit, and escalation all converge. This is the AT&T-1964 of agent topologies, fast and efficient and governance-friendly and structurally fragile in exactly the way Baran named. Single point of failure if the orchestrator is overwhelmed, compromised, or unavailable. Necessary stage before mesh because you cannot govern what you cannot route through. This is also the diagram in the Topology section above.

Stage 3: A2A Mesh + Orchestrator-as-Peer (the emerging shape)

Agents coordinate as peers over A2A. The orchestrator stays as a distinct node. It still owns governance, audit, escalation, and approval gates, but it stops being the bottleneck for routine peer coordination. Two specialist agents can hand work to each other directly without round-tripping through the orchestrator. The orchestrator participates in the mesh as a peer for governance-relevant flows and stays out of the mesh for routine work. This is where the industry has been heading since late 2025. It answers for resilience and parallel work without giving up the governance separation enterprises require. Most teams adopting this stage are running it as an extension of Stage 2, keeping the orchestrator-led control plane and adding the peer mesh as a coordination plane. This is also where the "graceful degradation" property Baran was solving for shows up in agent systems. When the orchestrator is busy, the mesh keeps moving.
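The split between control plane and coordination plane can be sketched as a routing rule. This is not the A2A wire format or any real protocol behavior. It is a hypothetical `Mesh` class that returns the path a handoff would take, assuming the team has listed which kinds of work count as governance-relevant.

```python
from dataclasses import dataclass, field


@dataclass
class Mesh:
    # Assumed input: task kinds that must pass through the orchestrator.
    governed: set[str]
    log: list[tuple[str, ...]] = field(default_factory=list)

    def handoff(self, sender: str, receiver: str, kind: str) -> tuple[str, ...]:
        """Return the path a handoff takes and record it."""
        if kind in self.governed:
            # Control plane: the orchestrator sits in the path.
            path = (sender, "orchestrator", receiver)
        else:
            # Coordination plane: peers talk directly.
            path = (sender, receiver)
        self.log.append(path)
        return path
```

Routine work takes the two-hop path and keeps moving when the orchestrator is busy. Only the flows named in `governed` wait on it.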

Stage 4 splits into two flavors. When the constraint shifts from "we own the system" to "no single party owns the system," the orchestrator-as-distinct-node model loses its single-anchor justification. Two distinct topologies have answered that constraint, and they sit at very different points on the maturity arc. One is real production reality at the cross-organization boundary. The other is a thought experiment about where the logic eventually leads.

Stage 4a: Federated Mesh (real, in production at the cross-org boundary)

Each participating organization keeps its own internal orchestrator (running Stage 2 or Stage 3 inside its own walls), but agents from different organizations coordinate as peers across organizational boundaries. The Linux Foundation's Agentic AI Foundation governs the A2A protocol that makes this work. As of April 2026, more than 150 organizations are in production deployment, with native A2A support inside Azure AI Foundry, Amazon Bedrock AgentCore, Salesforce Agentforce, ServiceNow, and SAP. Vertical adoption is concentrated in supply chain, financial services, insurance, and IT operations. The defensibility primitive is Signed Agent Cards, cryptographic signatures that let a receiving agent verify the sending agent's claimed organizational identity. The orchestrator does not vanish in Stage 4a. It just stops being the only orchestrator in the picture.

Stage 4a creates a new boundary the single-org diagram does not have: the cross-organizational trust boundary. Signed Agent Cards anchor identity at that boundary. They do not anchor scope or audit. Cross-org permissions, cross-org audit attestation, and revocation lag are the threats that live here, and Book 2 builds the controls for them.

Stage 4b: Pure Orchestratorless Mesh (hypothetical thought experiment)

Take the Stage 4a logic one full step further. What if even the org-local orchestrator disappears? Any agent can route to any other agent. Governance becomes a runtime property of the mesh itself, where every agent carries verifiable identity, scoped permissions get enforced at every boundary by the runtime, and audit logs are mesh-native and verifiable by any participant.

I have not seen this topology in production. Mycelium is the canonical attempt and self-describes as "v0.1 Alpha." Their public counters showed zero registered agents and zero coalitions formed when I checked. I include Stage 4b because the logic of the maturity progression points toward it as the limit case. The point is not prediction. The point is the diagnostic question. If governance itself had to live in the mesh rather than above it, what would the topology require? Stage 4b is what the answer looks like when followed all the way through. Whether the industry actually reaches it is a separate question.

The four stages map cleanly onto the AOD development discipline progression.

Match the topology to the constraint, not the maturity hype. A single Claude Code session writing a feature does not need an orchestrator. A cross-org agent ecosystem cannot operate inside Stage 2 without forcing one organization to become the central authority. The over-engineering anti-pattern is climbing the maturity ladder before the constraint demands it. The under-engineering anti-pattern is staying at Stage 2 when the work has already outgrown it (when one orchestrator has become a queue of bottlenecked work that should be peer-coordinated).

Before climbing to the next stage, ask three questions. (1) Does the work cross domains, organizations, or trust domains? (2) Is governance separation a hard requirement that the runtime cannot enforce inside the current stage? (3) Is the current stage's bottleneck node provably overloaded? Two or more yeses justify climbing. One yes is a warning. Zero yeses means stay where you are.
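The three-question check is small enough to write down as a function. The parameter names are my paraphrase of the questions, and treating three yeses the same as two is an assumption, since the rule as stated only names the two-yes case.

```python
def climb_decision(
    crosses_boundaries: bool,
    needs_governance_separation: bool,
    bottleneck_overloaded: bool,
) -> str:
    """Count the yeses and map them to a recommendation."""
    yeses = sum(
        [crosses_boundaries, needs_governance_separation, bottleneck_overloaded]
    )
    if yeses >= 2:  # assumption: three yeses also justify climbing
        return "climb"
    if yeses == 1:
        return "warning"
    return "stay"
```

A single Claude Code session writing a feature answers no to all three and gets `"stay"`. An orchestrator that is provably overloaded while the work also crosses trust domains gets `"climb"`.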

ARPANET teaches the discipline. Its builders started with four nodes (UCLA, SRI, UC Santa Barbara, Utah) and proved the topology before scaling toward continental coverage. The shape was right from the beginning. The deployment was disciplined.

Figure: After Roberts and BBN, December 1969, public domain. Hand-drawn sketch showing the four original network gateway computers (UCLA, SRI, UCSB, Utah) and their connected host computers (Sigma 7, 940, 360, PDP-10).

Baran's mesh was not deployed continentally on day one. The same caution applies to Stage 3 today, to the Stage 4a federation work the Linux Foundation is governing now, and to the Stage 4b limit case whenever the industry decides to ask it seriously.

What's Next

You now have the picture. A class diagram for agentic development. User, orchestrator, agent, capability. Four layers, clear responsibilities, explicit boundaries that double as a governance map. And a maturity arc that tells you which stage your system sits at today and what would force it to grow.

Chapter 15 shows how the layers actually talk to each other. Two protocols carry the load. MCP runs vertically (the agent reaching down to its tools). A2A runs horizontally (the agent reaching across to peer agents). They are complementary, not competitive. The protocol stack is what turns the topology from a diagram into a running system, and what makes the Stage 3 mesh possible.

Without a shared shape, every team rebuilds the diagram from scratch and wonders why their designs never transfer. You now have a reusable shape, and a maturity arc that tells you where to grow it next.

The class. The lifecycle. The architecture. You have all three.