June 30, 2026 · 5 min read

Agent Orchestration Is an Engineering Discipline

Why most agent systems don't ship, and what the AOS Harness does about it. Council & Crew, Issue 1.

I've watched a lot of agent systems die between the demo and the deploy. The patterns repeat. A reasoning chain that handled three tools beautifully chokes on six. A multi-agent setup that produced a polished plan in the notebook produces incoherent garbage when scaled past one user. A council of LLMs that deliberated impressively in the slides hands the production team an output nobody can act on.

The post-mortems all sound the same, too. "We needed better prompts." "The model wasn't smart enough." "We should have used Claude or GPT or Gemini instead." Sometimes those are true. Mostly they're not. The thing that actually broke is rarely the model. It's the space between the models.

Configuration is not architecture

Agent orchestration, the question of how multiple AI components compose into something that produces useful work, has been treated as configuration. A YAML file. A prompt template. A system message with seven roles and a few hopeful sentences about "working together."

That framing is breaking down. Production agent systems have to handle disagreement, hand-off, evidence standards, observability, cost attribution, failure modes specific to each model vendor, and a dozen other things that look a lot like the problems software engineers have been solving in distributed systems for thirty years. Calling these problems "configuration" is what got us into the demo-to-production graveyard.

The argument I want to make across the next twelve issues is that agent orchestration is becoming an engineering discipline. Not a prompt-tuning hobby. Not an AI research curiosity. An engineering discipline, with primitives, patterns, anti-patterns, observability requirements, and trade-offs that experienced engineers can reason about the same way they reason about service meshes and message queues.

Council and Crew

When you look at agent systems that actually ship, two patterns emerge underneath all the noise. I call them the Council and the Crew.

The Council is for decisions. You have a hard question, ambiguous evidence, multiple stakeholders, and no obvious right answer. A Council is a group of agents with different biases, temperaments, and evidence standards who deliberate. A neutral Arbiter frames the question. Specialists argue from opposing perspectives. A Provocateur stress-tests whatever consensus emerges. The output is a structured memo with ranked recommendations and documented dissent, not a single confident answer.
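To make that shape concrete, here is a minimal sketch of a Council loop in Python. Everything in it is hypothetical and for illustration only: the names `Position`, `Memo`, and `run_council` are not the AOS Harness API, agents are plain callables standing in for model calls, and ranking by summed self-reported confidence is a placeholder for real synthesis.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Position:
    agent: str
    recommendation: str
    confidence: float  # self-reported, 0.0 to 1.0 (an assumption of this sketch)


@dataclass
class Memo:
    question: str
    ranked: list[str]        # recommendations, strongest support first
    dissent: list[Position]  # every position that is not the top pick


# Hypothetical agent signatures: plain callables standing in for model calls.
Specialist = Callable[[str], Position]
Provocateur = Callable[[str, str], Optional[Position]]


def run_council(
    question: str,
    specialists: list[Specialist],
    provocateur: Provocateur,
) -> Memo:
    if not specialists:
        raise ValueError("a council needs at least one specialist")

    # Arbiter step: frame the question once so every agent argues the same thing.
    framed = f"Decide: {question.strip()}"

    # Specialists argue independently from their own perspective.
    positions = [ask(framed) for ask in specialists]

    # Placeholder synthesis: rank recommendations by summed confidence.
    support: dict[str, float] = {}
    for p in positions:
        support[p.recommendation] = support.get(p.recommendation, 0.0) + p.confidence
    ranked = sorted(support, key=lambda rec: support[rec], reverse=True)

    # Dissent is recorded rather than discarded.
    dissent = [p for p in positions if p.recommendation != ranked[0]]

    # Provocateur step: stress-test the leading recommendation.
    challenge = provocateur(framed, ranked[0])
    if challenge is not None:
        dissent.append(challenge)

    return Memo(question=framed, ranked=ranked, dissent=dissent)
```

The point of the sketch is the return type: the caller gets a ranking plus the dissent that survived, not one flattened answer.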

The Crew is for delivery. You have a brief, you need an artifact, and the path from one to the other has knowable stages. A Crew is a sequence of agents producing artifacts in order: requirements, architecture, planning, tasks, security review, assembly. A CTO Orchestrator drives the lifecycle. The output is a buildable execution package that an engineering team, or another agent crew, can act on.
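A Crew can be sketched the same way. The stage names below come from the list above; the rest is assumed for illustration: `ExecutionPackage`, `StageAgent`, and `run_crew` are hypothetical names, not the AOS Harness API, and the orchestrator is reduced to a loop that hands each stage the brief plus every earlier artifact.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage order taken from the prose; the identifiers themselves are illustrative.
STAGES = [
    "requirements",
    "architecture",
    "planning",
    "tasks",
    "security_review",
    "assembly",
]


@dataclass
class ExecutionPackage:
    brief: str
    artifacts: dict[str, str] = field(default_factory=dict)


# Hypothetical signature: (brief, earlier artifacts) -> this stage's artifact.
StageAgent = Callable[[str, dict[str, str]], str]


def run_crew(brief: str, agents: dict[str, StageAgent]) -> ExecutionPackage:
    # Fail before any work starts if a stage has no owner.
    missing = [stage for stage in STAGES if stage not in agents]
    if missing:
        raise ValueError(f"no agent registered for stages: {missing}")

    package = ExecutionPackage(brief=brief)
    for stage in STAGES:
        # Each agent gets a copy of what came before, so it cannot
        # rewrite an upstream artifact in place.
        prior = dict(package.artifacts)
        package.artifacts[stage] = agents[stage](brief, prior)
    return package
```

Passing a copy of the earlier artifacts is a deliberate choice in this sketch: downstream stages can read upstream work but not silently change it.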

Most real systems need both. You decide what to build with a Council, you build it with a Crew, and you cycle when the Crew hits something the Council didn't anticipate. The systems that fail in production are usually the ones that picked one pattern and forced it to do the other's job.

The missing primitive: profile

The thing that makes Council and Crew composable, rather than just descriptive, is the profile. A profile is a coherent assembly of agents (with their biases and temperaments), a workflow pattern, evidence standards, and the interface contracts between them. The Strategic Council profile is different from the Security Review profile not because the YAML keys are different, but because the agents in it bring different evidence standards and the synthesis is held to a different bar.
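One way to picture a profile as a first-class value is a small typed record. This is a sketch under stated assumptions, not the harness's actual schema: the field names, the `Contract` triple, and `validate_profile` are all hypothetical, chosen only to mirror the four ingredients named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    bias: str
    temperament: str


@dataclass(frozen=True)
class Contract:
    producer: str   # agent that emits the artifact
    consumer: str   # agent that receives it
    artifact: str   # what is handed over


@dataclass(frozen=True)
class Profile:
    name: str
    workflow: str                     # "council" or "crew" in this sketch
    agents: tuple[AgentSpec, ...]
    evidence_standard: str
    contracts: tuple[Contract, ...]


def validate_profile(profile: Profile) -> list[str]:
    """Return a list of problems; an empty list means the profile is coherent."""
    problems: list[str] = []
    if profile.workflow not in ("council", "crew"):
        problems.append(f"unknown workflow {profile.workflow!r}")

    known = {agent.name for agent in profile.agents}
    for contract in profile.contracts:
        for role in (contract.producer, contract.consumer):
            if role not in known:
                problems.append(
                    f"contract for {contract.artifact!r} references unknown agent {role!r}"
                )
    return problems
```

Treating the profile as one value means the agents, the workflow, the evidence bar, and the hand-offs can be checked together before anything runs.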

Most agent frameworks don't have profiles as a first-class primitive. They have agents and they have orchestration code. The result is that every new use case starts from scratch, every team builds its own private vocabulary, and nobody can reuse anyone else's work. That isn't an engineering discipline. It's a cottage industry.

What this newsletter will do

Council & Crew walks through the AOS Harness in public, every other week, for the next six months. The harness is a config-driven framework I've been building that makes Councils and Crews composable on top of Claude Code, Codex, Gemini, and Pi. It's open source. This newsletter is the field manual.

The arc is roughly: foundations first (issues 1 through 4), then the patterns in depth (issues 5 through 8), then domain-specific applications (issues 9 through 12). By the end, anyone reading from Issue 1 should be able to author their own profile for their own domain and contribute it back.

Each written issue is paired with a NotebookLM audio overview that drops in the off week. Listen on the commute, read at the desk. Both work as standalone artifacts.

Next time

Issue 2 is a tour of the four primitives that make the AOS Harness work: agents, profiles, skills, and domains. What each one is, what problem it solves, and how they compose into something you can ship.

If you're building agent systems and you've felt the gap between what the demo promised and what the production system delivered, this newsletter is for you. Subscribe and we'll meet here every other Tuesday.

Tags: agent-orchestration, ai-engineering