# Agent Team Hierarchical Dispatch — Design Spec

**Date:** 2026-03-22
**Status:** Draft
**Scope:** Agent team architecture redesign — from hub-and-spoke to layered hierarchy with universal dispatch

## Problem

The current agent system uses a hub-and-spoke topology: the orchestrator dispatches individual specialists, agents cannot call each other, and inter-agent communication goes through a manual handoff mechanism processed by the main session. This creates:

- **Passive agents** — they produce output and request handoffs but can't act autonomously
- **2-hop overhead** — every inter-agent interaction requires: request → main session dispatch → result → continuation re-invocation
- **Main session bottleneck** — the main session manually processes every handoff, tracking chains, depth, and history
- **No lateral collaboration** — specialists can't consult each other directly

## Goals

- **Autonomy** — agents can pull in experts when they need them without waiting for orchestration
- **Quality** — richer cross-domain collaboration produces better outputs
- **Full guardrails** — cost control, observability, loop prevention, quality of inter-agent communication

## Non-Goals

- Hard token budget enforcement (Claude Code doesn't expose token counts to agents)
- Structural enforcement of hierarchy via tool access (protocol-enforced instead)
- Changing what individual specialists do (only how they communicate)

## Design

### Team Hierarchy

20 agents (16 existing + 4 new). 3 tiers (Tier 0 to Tier 2). 3 sub-teams + 2 staff.

One existing agent is promoted to a lead role (Product Strategist → Product Lead). Four new agents are created: Architecture Lead, Quality Lead, Senior Backend Engineer, Senior Frontend Engineer. Backend Architect remains unchanged as a Tier 2 specialist.

```
Tier 0: ORCHESTRATOR (Tech Lead)
         │
         ├── Tier 1: ARCHITECTURE LEAD (NEW — pure system architect + coordinator)
         │    ├── Tier 2: Backend Architect (unchanged)
         │    ├── Tier 2: Frontend Architect
         │    ├── Tier 2: DB Architect
         │    ├── Tier 2: Remotion Engineer
         │    ├── Tier 2: Senior Backend Engineer (NEW — implements backend code)
         │    └── Tier 2: Senior Frontend Engineer (NEW — implements frontend code)
         │
         ├── Tier 1: QUALITY LEAD (NEW — QA strategy + coordinator)
         │    ├── Tier 2: Frontend QA
         │    ├── Tier 2: Backend QA
         │    ├── Tier 2: Security Auditor
         │    ├── Tier 2: Design Auditor
         │    └── Tier 2: Performance Engineer
         │
         ├── Tier 1: PRODUCT LEAD (Product Strategist, promoted + renamed)
         │    ├── Tier 2: UI/UX Designer
         │    ├── Tier 2: Technical Writer
         │    └── Tier 2: ML/AI Engineer
         │
         ├── Tier 1: DevOps Engineer (staff, reports directly)
         └── Tier 1: Debug Specialist (staff, reports directly)
```

### Role Changes

**Architecture Lead (NEW).** Pure system architect and coordinator. Does NOT retain Backend Architect expertise — it operates at the system level, understanding how all services interact but not doing deep implementation in any one. Expertise in:
- **Cross-service architecture** — understanding the full data flow: Frontend → Backend API → Dramatiq → Remotion → S3 → WebSocket
- **API contract design** — defining the interfaces between services, ensuring contracts are consistent and complete
- **System decomposition** — breaking architectural tasks into specialist-scoped sub-tasks for Backend Architect, Frontend Architect, DB Architect, Remotion Engineer
- **Architectural trade-off analysis** — evaluating approaches across performance, maintainability, security, and developer experience
- **Frontend-last phasing** — sequencing backend/DB work before frontend work within its sub-team (see Frontend-Last Phasing section)

Always operates in coordinator mode. When the orchestrator needs deep backend expertise, the request is routed through the Architecture Lead to Backend Architect; it is not answered by the Architecture Lead in a "specialist mode."

**Backend Architect (UNCHANGED).** Remains a Tier 2 specialist under Architecture Lead. Retains all Python/FastAPI/service-layer expertise. No changes to its domain knowledge or responsibilities — only hierarchy context is added (lead: Architecture Lead, tier: 2). Focuses on architecture, design, and API contracts — does NOT write implementation code.

**Senior Backend Engineer (NEW).** Implementation specialist for the backend. Writes production Python/FastAPI code based on architectural specs from Backend Architect and DB Architect. Expertise in:
- **FastAPI implementation** — endpoints, dependency injection, Pydantic schemas, middleware
- **SQLAlchemy async** — models, repositories, migrations, complex queries
- **Dramatiq tasks** — background job implementation, error handling, retries
- **Service layer patterns** — business logic, cross-module coordination
- Receives architectural specs and API contracts as input, produces working code as output
- Follows the project's module pattern exactly: `models.py`, `schemas.py`, `repository.py`, `service.py`, `router.py`

**Senior Frontend Engineer (NEW).** Implementation specialist for the frontend. Writes production Next.js/React/TypeScript code based on architectural specs from Frontend Architect and design specs from UI/UX Designer. Expertise in:
- **Next.js 16 / React 19** — App Router, Server Components, client components, data fetching
- **FSD architecture** — strict layer boundaries, module-aware features, barrel exports
- **Component implementation** — Radix Themes, SCSS Modules, responsive layouts, animations
- **State management** — TanStack Query for server state, Redux for client state, form handling
- Receives component tree designs and interaction specs as input, produces working code as output
- Enforces FSD import rules: `pages → widgets → features → entities → shared`

**Product Strategist → Product Lead.** File renamed: `product-strategist.md` → `product-lead.md`. Memory directory renamed accordingly. Retains all product/growth expertise. Gains coordination of the "why and what" sub-team: UX design, documentation, ML model decisions. Operates in dual mode: coordinator mode (default) for sub-team orchestration, specialist mode when the orchestrator needs direct product/growth analysis. The orchestrator signals mode via `MODE: coordinator` or `MODE: specialist` in the dispatch context. If omitted, coordinator mode is assumed.

**Quality Lead (NEW).** Senior QA/verification coordinator. Always operates in coordinator mode. Expertise in:
- **Risk-based test strategy** — analyzing code changes to determine what kinds of testing are needed (unit, integration, E2E, security, performance) and at what coverage level
- **Quality synthesis** — combining outputs from multiple QA/audit agents into a unified quality assessment with prioritized findings
- **Test gap analysis** — identifying what isn't covered, what edge cases are missing, what failure modes haven't been considered

Decision framework for dispatching sub-team:
- Code changes touch auth, user input, or file handling → dispatch Security Auditor
- Code changes touch DB queries, schema, or data volume → dispatch Performance Engineer
- Code changes touch UI components or flows → dispatch Frontend QA + Design Auditor
- Code changes touch API endpoints or service boundaries → dispatch Backend QA
- When QA agents disagree (e.g., Security Auditor says a pattern is safe but Backend QA says it's untestable), the Quality Lead weighs risk severity vs. testability and makes the call, noting the trade-off in the audit trail
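
The dispatch triggers above can be read as a lookup table. The following is a minimal sketch, not the Quality Lead's actual logic: the area tags (`"auth"`, `"schema"`, and so on) and the `agents_for` helper are hypothetical names introduced for illustration, while the agent identifiers follow the agent file names listed under Implementation Scope.

```python
# Hypothetical area tags: the spec names the triggers but not a tagging scheme.
DISPATCH_RULES: dict[str, list[str]] = {
    "auth": ["security-auditor"],
    "user-input": ["security-auditor"],
    "file-handling": ["security-auditor"],
    "db-queries": ["performance-engineer"],
    "schema": ["performance-engineer"],
    "data-volume": ["performance-engineer"],
    "ui-components": ["frontend-qa", "design-auditor"],
    "ui-flows": ["frontend-qa", "design-auditor"],
    "api-endpoints": ["backend-qa"],
    "service-boundaries": ["backend-qa"],
}


def agents_for(change_areas: list[str]) -> list[str]:
    """Return the sub-team agents to dispatch, de-duplicated, in first-seen order."""
    selected: list[str] = []
    for area in change_areas:
        for agent in DISPATCH_RULES.get(area, []):
            if agent not in selected:
                selected.append(agent)
    return selected
```

A change touching auth and API endpoints would select `security-auditor` and `backend-qa`; unknown tags select nobody, which leaves the judgment call with the Quality Lead.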

**DevOps Engineer & Debug Specialist** remain staff roles reporting directly to the Orchestrator. They are cross-cutting agents that any sub-team might need. Leads and specialists may dispatch staff agents directly when the need is specific; staff agents are not gated behind the orchestrator.

### Lead Dual-Mode Operation

Only the Product Lead has dual-mode operation, since it is a promoted specialist (Product Strategist) that retains domain expertise alongside coordination responsibilities.

**Coordinator mode** (default): "Decompose this task for your sub-team, dispatch the right specialists, synthesize results." The lead acts as a manager — scoping, dispatching, synthesizing.

**Specialist mode** (explicit tag): "Answer this as a product/growth specialist — do NOT dispatch your sub-team." The lead acts as an individual contributor. Used when the orchestrator needs direct product analysis.

The orchestrator includes `MODE: coordinator` or `MODE: specialist` in the dispatch context. If omitted, coordinator mode is assumed.

Architecture Lead and Quality Lead are pure coordinators — they always operate in coordinator mode and have no specialist mode.
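
As an illustration of the mode rules, here is a small sketch of how a dispatch prompt could be interpreted. It assumes the `MODE:` tag sits on its own line, and `resolve_mode` and `LEADS_WITH_SPECIALIST_MODE` are hypothetical names. Falling back to coordinator mode when a pure coordinator receives `MODE: specialist` is also an assumption; the spec only says those leads have no specialist mode.

```python
import re

# Only the Product Lead has a specialist mode according to the spec.
LEADS_WITH_SPECIALIST_MODE = {"product-lead"}

MODE_PATTERN = re.compile(r"^\s*MODE:\s*(coordinator|specialist)\s*$", re.MULTILINE)


def resolve_mode(agent: str, dispatch_text: str) -> str:
    """Pick the operating mode for a lead from its dispatch prompt."""
    match = MODE_PATTERN.search(dispatch_text)
    # If the tag is omitted, coordinator mode is assumed.
    requested = match.group(1) if match else "coordinator"
    if requested == "specialist" and agent not in LEADS_WITH_SPECIALIST_MODE:
        # Assumption: pure coordinators ignore a specialist tag rather than fail.
        return "coordinator"
    return requested
```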

### What "Lead" Means

A lead agent in coordinator mode:

1. Receives a scoped sub-task from the orchestrator
2. Decomposes it for their sub-team
3. Dispatches specialists with packaged context
4. Synthesizes specialist outputs into a unified recommendation
5. Reports back to the orchestrator with synthesized result + audit trail

Leads do not micromanage. If a specialist dispatches another agent directly (cross-team), that's fine — the audit trail captures it.

### Dispatch Protocol

#### Dispatch Context Object

Every agent call carries a context object for chain tracking, loop prevention, and depth enforcement:

```
DISPATCH CONTEXT:
  origin_task: "Add caption style presets"
  call_chain: ["orchestrator"]
  current_depth: 1
  max_depth: 3
  initiating_agent: "orchestrator"
  reason: "Need cross-service architecture design for caption presets feature"
```

Depth counting: `current_depth` equals the length of `call_chain`, which lists every agent that has dispatched so far, ending with the caller. The agent being dispatched is not in the chain until it dispatches someone itself. The orchestrator starts the chain at depth 1 when dispatching a lead. The lead adds itself, making depth 2 when dispatching a specialist. The specialist adds itself, making depth 3 when dispatching one more agent. Depth 3 is the maximum, so the agent that receives it makes no further dispatches.

```
Orchestrator dispatches Lead    → call_chain: [orchestrator]           → current_depth: 1
Lead dispatches Specialist      → call_chain: [orchestrator, lead]     → current_depth: 2
Specialist dispatches another   → call_chain: [orchestrator, lead, specialist] → current_depth: 3 (terminal)
```

The calling agent appends itself to `call_chain` and increments `current_depth` before dispatching. The called agent checks these before doing anything.
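
The append-and-increment rule can be sketched as a small immutable record. This is an illustration only: in the protocol the context travels as text inside the prompt, and the `DispatchContext` class and `for_dispatch_by` method are hypothetical names. Deriving `current_depth` from the chain length is a design choice made here so the two values cannot drift apart.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DispatchContext:
    origin_task: str
    initiating_agent: str
    reason: str
    call_chain: tuple[str, ...] = ()
    max_depth: int = 3

    @property
    def current_depth(self) -> int:
        # Depth is defined as the length of the call chain.
        return len(self.call_chain)

    def for_dispatch_by(self, caller: str, reason: str) -> "DispatchContext":
        """Context the caller hands to the agent it is about to dispatch."""
        return replace(self, call_chain=self.call_chain + (caller,), reason=reason)


root = DispatchContext(
    origin_task="Add caption style presets",
    initiating_agent="orchestrator",
    reason="Task received from main session",
)
to_lead = root.for_dispatch_by("orchestrator", "Need cross-service architecture design")
to_specialist = to_lead.for_dispatch_by("architecture-lead", "Need schema design")
```

Here `to_lead` carries a one-element chain (depth 1) and `to_specialist` a two-element chain (depth 2), matching the first two rows of the table above.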

#### Dispatch Rules

**Rule 1: Preferred path first.** Follow the hierarchy by default: Orchestrator → Leads → Specialists → back to Lead.

**Rule 2: Direct calls allowed with justification.** Any agent can call any other agent when the question is narrow and specific, going through the hierarchy would add latency without value, and the caller knows exactly which specialist to ask. Cross-team direct calls MUST use consultation mode (not full dispatch) unless the caller's lead is not in the call chain.

**Rule 3: Never dispatch your own lead.** Specialists never dispatch upward to their lead. If lead-level coordination is needed, return results with a note.

**Rule 4: Never dispatch the orchestrator.** Information flows up through return values, not dispatch calls.

#### Two Dispatch Modes

**Full dispatch** — spawn the agent with a complete task. Used by leads dispatching specialists or for substantial cross-team work.

```
Agent(subagent_type="db-architect", prompt="
  DISPATCH CONTEXT:
    call_chain: [orchestrator, architecture-lead]
    current_depth: 2
    max_depth: 3
    ...

  TASK: Design the caption_presets table schema...
  CONTEXT: ...
  DELIVERABLE: Schema DDL + migration strategy
")
```

**Consultation** — a focused question expecting a short answer. Used for cross-team quick checks and all cross-team direct calls from specialists.

```
Agent(subagent_type="security-auditor", prompt="
  DISPATCH CONTEXT:
    call_chain: [orchestrator, architecture-lead, db-architect]
    current_depth: 3
    max_depth: 3
    ...

  CONSULTATION (not full task):
  Is storing user-selected caption font names as raw strings in PostgreSQL
  a security concern?

  SHORT ANSWER EXPECTED.
")
```

Both modes can be parallelized by making multiple `Agent` calls in a single response (e.g., a lead dispatching 3 specialists simultaneously).
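
For illustration, the two prompt shapes could be assembled by helpers like the ones below. This sketch only builds the prompt text; it does not call the Agent tool, and the function names are hypothetical. It emits just the context fields shown in the examples above (the fields elided there with `...` are left out), and it assumes `call_chain` already includes the caller.

```python
def dispatch_header(call_chain: list[str], max_depth: int = 3) -> str:
    """Render the DISPATCH CONTEXT block; depth is the chain length."""
    return "\n".join([
        "DISPATCH CONTEXT:",
        f"  call_chain: [{', '.join(call_chain)}]",
        f"  current_depth: {len(call_chain)}",
        f"  max_depth: {max_depth}",
    ])


def full_dispatch_prompt(call_chain: list[str], task: str, context: str, deliverable: str) -> str:
    """Prompt for a complete task handed to another agent."""
    return "\n".join([
        dispatch_header(call_chain),
        "",
        f"TASK: {task}",
        f"CONTEXT: {context}",
        f"DELIVERABLE: {deliverable}",
    ])


def consultation_prompt(call_chain: list[str], question: str) -> str:
    """Prompt for a focused question that expects a short answer."""
    return "\n".join([
        dispatch_header(call_chain),
        "",
        "CONSULTATION (not full task):",
        question,
        "",
        "SHORT ANSWER EXPECTED.",
    ])
```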

#### Receiving a Dispatch

1. Read dispatch context — check `current_depth`, `call_chain`
2. If `current_depth >= max_depth` — do NOT dispatch further (this includes consultations — consultation mode is still a dispatch and subject to depth limits). Note: "Depth limit reached, could not consult [X]"
3. Before dispatching any other agent, check whether the target is already in `call_chain`. If it is, do not dispatch and note: "Loop detected: [agent] already in call chain"
4. Read team protocol and own memory as usual
5. Execute the task, dispatching sub-agents if needed and depth allows
6. Return results with structured audit trail

**Context window note:** Agents at depth 2 should prefer consultation mode over full dispatch to preserve context budget. Agents at depth 3 cannot dispatch at all — they use Deferred Consultations for unresolved needs. Each dispatch spawns a new subprocess that loads the team protocol, agent file, memory, and dispatch context. Deep chains have diminishing returns as less context room is available for actual work.
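
Steps 2 and 3 amount to two checks that a receiving agent applies before any further dispatch. A minimal sketch follows, assuming the received `call_chain` is available as a list. The `check_dispatch` helper is a hypothetical name, and in practice the agents apply these rules by following the protocol text rather than by running code.

```python
def check_dispatch(call_chain: list[str], max_depth: int, target: str) -> tuple[bool, str]:
    """Return (allowed, note) for a dispatch the receiving agent is considering.

    call_chain is the chain exactly as received, so its length is current_depth.
    """
    current_depth = len(call_chain)
    if current_depth >= max_depth:
        # Applies to consultations too: they are still dispatches.
        return False, f"Depth limit reached, could not consult {target}"
    if target in call_chain:
        return False, f"Loop detected: {target} already in call chain"
    return True, ""
```

With a received chain of `["orchestrator", "architecture-lead"]` and `max_depth` 3, a call to `security-auditor` is allowed, while a call to `architecture-lead` is refused as a loop. With a three-element chain, every further dispatch is refused.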

### Guardrails

#### Guardrail 1: Depth Limit (Hard)

Max depth 3, counted as the length of `call_chain`:

| Depth | Who | Can dispatch? |
|-------|-----|---------------|
| 1 | Lead / Staff (dispatched by orchestrator) | Yes → Specialists, other leads, staff |
| 2 | Specialist dispatched by lead | Yes → one more agent (prefer consultation) |
| 3 | Agent dispatched by specialist | No — terminal. Return "Deferred Consultation" if more input needed |

The orchestrator itself is at depth 0 and always dispatches at depth 1. Staff agents (DevOps, Debug Specialist) follow the same depth rules as leads when dispatched by the orchestrator — they enter at depth 1 and can dispatch further at depth 2.

Deferred Consultation format for depth-limited agents:

```
## Deferred Consultations

### → Performance Engineer
**Question:** Will this join across 3 tables degrade at 100k+ rows?
**Context:** [schema details]
**Blocks:** My indexing recommendation
```

The lead picks this up and handles it in a follow-up dispatch.

#### Guardrail 2: Loop Prevention (Hard)

Before dispatching, check:

```
if target_agent in call_chain:
    DO NOT DISPATCH
    return "Loop detected: {target_agent} already in call chain {call_chain}"
```

Absolute — no exceptions. If Agent A called Agent B who needs Agent A's input, Agent B returns what it has and notes the dependency.

#### Guardrail 3: Cost Control (Soft, Protocol-Enforced)

- **Consultation over dispatch.** Single facts or opinions use consultation mode, not full dispatch.
- **Dispatch justification required.** Every dispatch includes a `reason`. "I might need their input" is not valid.
- **Lead budget awareness.** Leads prefer 2-3 specialists over dispatching their entire sub-team. Ask: "Can I answer part of this myself?"

#### Guardrail 4: Quality Gate (Protocol)

An agent should only dispatch when it can articulate:

1. **WHAT** — specific question or task for the target
2. **WHY** — why it can't answer this itself
3. **BACK** — what specific deliverable it needs returned

If all three aren't clear, don't dispatch.
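
The gate can be pictured as a completeness check on three fields. The sketch below is illustrative only: `DispatchJustification` and `missing_parts` are hypothetical names, and a non-empty string is a weak stand-in for "can articulate", which remains a judgment the agent makes.

```python
from dataclasses import dataclass


@dataclass
class DispatchJustification:
    what: str  # specific question or task for the target
    why: str   # why the caller cannot answer this itself
    back: str  # the deliverable the caller needs returned


def missing_parts(justification: DispatchJustification) -> list[str]:
    """Names of the gate items that are still blank; empty means the gate passes."""
    parts = {
        "WHAT": justification.what,
        "WHY": justification.why,
        "BACK": justification.back,
    }
    return [name for name, text in parts.items() if not text.strip()]
```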

#### Escalation Paths

| Situation | Action |
|-----------|--------|
| Depth limit reached | Return "Deferred Consultation" — lead handles it |
| Loop detected | Return partial results, note circular dependency |
| Agent outputs contradict | Return both perspectives — lead resolves or escalates to orchestrator |
| Task is wrong or out of scope | Return early with "Scope Challenge" (see format below) — lead decides |

Scope Challenge format:

```
## Scope Challenge

**Issue:** [What is wrong with the task as given]
**Why this matters:** [Impact if we proceed as-is]
**Recommendation:** [What should be done instead]
**Blocks:** [What work is paused pending resolution]
```

### Audit Trail

#### Output Format

Every agent that made dispatch calls includes:

```
## Calls Made

### 1. → DB Architect (full dispatch)
**Reason:** Need schema validation for caption_presets table
**Asked:** "Design the caption_presets table with these constraints..."
**Got back:** Schema DDL with 4 columns, recommended GIN index on style_config JSONB
**Used in:** My architecture recommendation, Section 2

### 2. → Security Auditor (consultation)
**Reason:** Unsure if storing font names as raw strings is injectable
**Asked:** "Is storing user-selected caption font names as raw strings a security concern?"
**Got back:** "No risk — parameterized queries prevent injection"
**Used in:** Confirmed approach is safe, no changes needed

### 3. → Frontend Architect (full dispatch, DEFERRED)
**Reason:** Depth limit reached (3/3), could not dispatch
**Needs:** Validate API response shape works for the component tree
**Blocks:** API contract recommendation
```

#### Field Semantics

- **Reason** — maps to Guardrail 4's "WHY"
- **Asked** — the actual question (1-2 sentences, not the full prompt)
- **Got back** — key takeaway from response (1-2 sentences)
- **Used in** — which part of the agent's recommendation was informed by this call
- **DEFERRED** — marks calls blocked by depth/loop limits

#### Recursive Bubbling

Audit trails are recursive: each agent's report carries its own Calls Made along with the trails returned by the agents it dispatched. The orchestrator sees the leads' calls, and the main session sees the orchestrator's calls, so the full tree is reconstructable:

```
Main Session
└── Orchestrator
    ├── Calls Made: → Architecture Lead, → Quality Lead (parallel)
    ├── Architecture Lead
    │   ├── Calls Made: → DB Architect, → Frontend Architect (parallel)
    │   │   ├── DB Architect → Calls Made: → Performance Engineer (consultation)
    │   │   └── Frontend Architect → Calls Made: (none)
    │   └── Result: unified architecture recommendation
    └── Quality Lead
        ├── Calls Made: → Backend QA, → Security Auditor (parallel)
        └── Result: unified quality assessment
```

When an agent makes no calls, it omits the "Calls Made" section entirely.
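
As a sketch of the reconstruction, the nested trail can be modeled as a tree of reports. The `AgentReport` type and both helpers are hypothetical; the sample tree mirrors the example above, with the orchestrator as the root at depth 0.

```python
from dataclasses import dataclass, field


@dataclass
class AgentReport:
    agent: str
    calls: list["AgentReport"] = field(default_factory=list)


def render_tree(report: AgentReport, level: int = 0) -> list[str]:
    """One line per agent, indented two spaces per level of dispatch."""
    lines = ["  " * level + report.agent]
    for child in report.calls:
        lines.extend(render_tree(child, level + 1))
    return lines


def deepest_dispatch(report: AgentReport) -> int:
    """Depth of the deepest dispatched agent, counting the root as depth 0."""
    return max((1 + deepest_dispatch(child) for child in report.calls), default=0)


tree = AgentReport("orchestrator", [
    AgentReport("architecture-lead", [
        AgentReport("db-architect", [AgentReport("performance-engineer")]),
        AgentReport("frontend-architect"),
    ]),
    AgentReport("quality-lead", [
        AgentReport("backend-qa"),
        AgentReport("security-auditor"),
    ]),
])
```

For this tree `deepest_dispatch(tree)` is 3, reached by the Performance Engineer consultation, which sits exactly at the depth limit.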

### Orchestrator Changes

The orchestrator shifts from micromanager to executive. It dispatches leads directly (using the Agent tool), collects their results, and synthesizes. The main session only talks to the orchestrator.

**Before:**
- Selects every individual specialist
- Returns a plan for the main session to execute
- Main session dispatches agents, processes handoffs, manages chains
- Re-invokes agents in continuation mode

**After:**
- Selects which leads to involve (architecture, quality, product)
- Dispatches leads + staff directly via Agent tool (not returning a plan for the main session)
- Collects lead-level results (already synthesized with audit trail)
- Resolves cross-team conflicts between leads
- Returns final synthesized recommendation to the main session

Specific changes:
1. "Select Agents" → "Select Leads" — pick concerns, not specialists
2. "Predict Handoffs" simplifies — cross-team dependencies between leads only
3. Handoff processing disappears — leads handle internally
4. Conflict resolution in two tiers: leads resolve intra-team, orchestrator resolves inter-team

### Frontend-Last Phasing

This rule moves from the orchestrator to the Architecture Lead's agent file, since the Architecture Lead now owns the decomposition of architectural work across frontend and backend.

The rule: when a task involves both frontend and backend work, the Architecture Lead must dispatch backend-affecting agents (such as Backend Architect and DB Architect) before frontend-affecting agents (such as Frontend Architect). The Architecture Lead sequences this internally, so the orchestrator does not need to manage phasing.

The team protocol retains a brief reference: "Architecture Lead enforces frontend-last phasing within its sub-team — see `architecture-lead.md`."
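
The phasing rule can be sketched as splitting a planned dispatch list into two ordered phases. Which agents count as frontend-affecting is an assumption here: the set below includes Frontend Architect and Senior Frontend Engineer and treats everyone else, including Remotion Engineer, as belonging to the earlier phase. `plan_phases` is a hypothetical helper.

```python
# Assumption: these two agents are the frontend-affecting ones.
FRONTEND_AGENTS = {"frontend-architect", "senior-frontend-engineer"}


def plan_phases(agents: list[str]) -> list[list[str]]:
    """Split agents into ordered phases: non-frontend work first, frontend last.

    Agents within one phase may be dispatched in parallel.
    """
    earlier = [a for a in agents if a not in FRONTEND_AGENTS]
    later = [a for a in agents if a in FRONTEND_AGENTS]
    return [phase for phase in (earlier, later) if phase]
```

A task needing `frontend-architect`, `db-architect`, and `backend-architect` would yield two phases, with the frontend architect alone in the second.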

### Main Session Changes

The dispatch loop in CLAUDE.md simplifies dramatically. The main session no longer processes individual agent dispatches, handoffs, or chains.

**After:**
1. Dispatch orchestrator with task context
2. Orchestrator handles everything internally (dispatches leads, collects results, resolves conflicts)
3. Receive orchestrator's final synthesis (includes recursive audit trail)
4. Present results to user with team credit summary

The main session no longer: processes individual handoffs, tracks chain history, enforces depth limits, re-invokes agents in continuation mode, dispatches individual specialists, or manages phasing. All of that is now handled within the orchestrator → lead → specialist hierarchy.

## Implementation Scope

### Implementation Order

Execute in this sequence to ensure each layer is ready before the layer above uses it:

1. **Create new agents** — `.claude/agents/architecture-lead.md`, `.claude/agents/quality-lead.md`, `.claude/agents/senior-backend-engineer.md`, `.claude/agents/senior-frontend-engineer.md`
2. **Update team protocol** — add dispatch protocol, context object, guardrails, audit trail format, hierarchy definition to `.claude/agents-shared/team-protocol.md`
3. **Update specialist agents** — add hierarchy context (lead identity, tier, dispatch protocol reference) to all existing specialist agent files
4. **Rename and update Product Lead** — rename `product-strategist.md` → `product-lead.md`; add lead coordination responsibilities and dual-mode behavior
5. **Update staff agents** — add hierarchy context (staff role, direct-to-orchestrator) to DevOps and Debug Specialist
6. **Update orchestrator** — shift to lead-level dispatch, remove individual specialist routing, add dual-mode dispatch for Product Lead
7. **Update CLAUDE.md** — simplify dispatch loop to orchestrator-only model, update agent team description and count to 20, document architect-vs-engineer role split
8. **Rename memory directories** — `.claude/agents-memory/product-strategist/` → `.claude/agents-memory/product-lead/`
9. **Create memory directories** — `.claude/agents-memory/architecture-lead/`, `.claude/agents-memory/quality-lead/`, `.claude/agents-memory/senior-backend-engineer/`, `.claude/agents-memory/senior-frontend-engineer/`
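
The purely mechanical parts of this sequence (the rename in step 4 and steps 8 and 9) could be scripted. The sketch below is one possible approach, not a prescribed migration tool: `apply_filesystem_steps` and `rename_if_needed` are hypothetical names, and in a Git checkout `git mv` may be preferable for the renames. Writing the agent files' content is out of scope here.

```python
from pathlib import Path

NEW_AGENTS = (
    "architecture-lead",
    "quality-lead",
    "senior-backend-engineer",
    "senior-frontend-engineer",
)


def rename_if_needed(old: Path, new: Path) -> None:
    """Rename old to new unless it was already done (keeps the script re-runnable)."""
    if old.exists() and not new.exists():
        old.rename(new)


def apply_filesystem_steps(root: Path) -> None:
    agents = root / ".claude" / "agents"
    memory = root / ".claude" / "agents-memory"
    # Step 4 (rename only) and step 8: Product Strategist becomes Product Lead.
    rename_if_needed(agents / "product-strategist.md", agents / "product-lead.md")
    rename_if_needed(memory / "product-strategist", memory / "product-lead")
    # Step 9: memory directories for the four new agents.
    for name in NEW_AGENTS:
        (memory / name).mkdir(parents=True, exist_ok=True)
```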

### Files to Create
- `.claude/agents/architecture-lead.md` — new Architecture Lead agent with system architecture, cross-service decomposition, frontend-last phasing
- `.claude/agents/quality-lead.md` — new Quality Lead agent with QA strategy, dispatch framework, synthesis protocol
- `.claude/agents/senior-backend-engineer.md` — new implementation agent for Python/FastAPI code
- `.claude/agents/senior-frontend-engineer.md` — new implementation agent for Next.js/React/TypeScript code

### Files to Rename
- `.claude/agents/product-strategist.md` → `.claude/agents/product-lead.md`
- `.claude/agents-memory/product-strategist/` → `.claude/agents-memory/product-lead/`

### Files to Modify
- `.claude/agents-shared/team-protocol.md` — add dispatch protocol, context object, guardrails, audit trail format, hierarchy definition, update team roster to 20 agents
- `.claude/agents/orchestrator.md` — shift to lead-level dispatch, add direct Agent dispatch of leads, simplify pipeline, add dual-mode dispatch tagging for Product Lead
- `.claude/agents/product-lead.md` (post-rename) — add lead coordination, dual-mode operation, sub-team definition
- `.claude/agents/backend-architect.md` — add hierarchy context (lead: Architecture Lead, tier: 2), clarify architect-only role (no implementation)
- `.claude/agents/frontend-architect.md` — add hierarchy context (lead: Architecture Lead, tier: 2), clarify architect-only role
- `.claude/agents/db-architect.md` — add hierarchy context (lead: Architecture Lead, tier: 2)
- `.claude/agents/remotion-engineer.md` — add hierarchy context (lead: Architecture Lead, tier: 2)
- `.claude/agents/frontend-qa.md` — add hierarchy context (lead: Quality Lead, tier: 2)
- `.claude/agents/backend-qa.md` — add hierarchy context (lead: Quality Lead, tier: 2)
- `.claude/agents/security-auditor.md` — add hierarchy context (lead: Quality Lead, tier: 2)
- `.claude/agents/design-auditor.md` — add hierarchy context (lead: Quality Lead, tier: 2)
- `.claude/agents/performance-engineer.md` — add hierarchy context (lead: Quality Lead, tier: 2)
- `.claude/agents/ui-ux-designer.md` — add hierarchy context (lead: Product Lead, tier: 2)
- `.claude/agents/technical-writer.md` — add hierarchy context (lead: Product Lead, tier: 2)
- `.claude/agents/ml-ai-engineer.md` — add hierarchy context (lead: Product Lead, tier: 2)
- `.claude/agents/devops-engineer.md` — add hierarchy context (staff, tier: 1, direct to orchestrator)
- `.claude/agents/debug-specialist.md` — add hierarchy context (staff, tier: 1, direct to orchestrator)
- `CLAUDE.md` — simplify dispatch loop to orchestrator-only, update agent count to 20, update team description

### Files Unchanged
- `.claude/agents-memory/` existing contents — memory files stay, only product-strategist directory renames
- Individual agent domain expertise — unchanged, only communication protocol and hierarchy context added
- All existing agent tools — unchanged (all agents already have the Agent tool)