feat: rewrite team protocol with hierarchical dispatch, guardrails, and audit trail

2026-03-22 22:39:18 +03:00
parent e3f9cefc24
commit cabab19649
1 changed files with 197 additions and 41 deletions
@@ -14,59 +14,215 @@ Backend modules (11): users, projects, media, files, transcription, captions, jo

 Cross-service flow: Frontend → Backend API (JWT auth) → Dramatiq (Redis) → Remotion → S3 → WebSocket notification back to Frontend.

+## Team Hierarchy
+
+20 agents organized in a 3-tier hierarchy (Tier 0 through Tier 2):
+
+    Tier 0: ORCHESTRATOR (Tech Lead)
+             │
+             ├── Tier 1: ARCHITECTURE LEAD (coordinator)
+             │    ├── Tier 2: Backend Architect
+             │    ├── Tier 2: Frontend Architect
+             │    ├── Tier 2: DB Architect
+             │    ├── Tier 2: Remotion Engineer
+             │    ├── Tier 2: Senior Backend Engineer
+             │    └── Tier 2: Senior Frontend Engineer
+             │
+             ├── Tier 1: QUALITY LEAD (coordinator)
+             │    ├── Tier 2: Frontend QA
+             │    ├── Tier 2: Backend QA
+             │    ├── Tier 2: Security Auditor
+             │    ├── Tier 2: Design Auditor
+             │    └── Tier 2: Performance Engineer
+             │
+             ├── Tier 1: PRODUCT LEAD (coordinator, dual-mode)
+             │    ├── Tier 2: UI/UX Designer
+             │    ├── Tier 2: Technical Writer
+             │    └── Tier 2: ML/AI Engineer
+             │
+             ├── Tier 1: DevOps Engineer (staff)
+             └── Tier 1: Debug Specialist (staff)
+
+**Architects** design specs, contracts, and patterns. **Engineers** implement production code from those specs. **Leads** coordinate their sub-team. **Staff** agents are cross-cutting and report directly to the Orchestrator.
+
 ## Team Roster

-| Agent | What they do | New Tools | Request when |
-|-------|-------------|-----------|--------------|
-| **Orchestrator** | Task decomposition, agent routing, context packaging | — | You don't — main session dispatches you |
-| **Frontend Architect** | Next.js/React/FSD patterns, component architecture | Chrome browser, knip | Frontend architecture decisions, component design |
-| **Backend Architect** | FastAPI/Python patterns, service design, API contracts | Redis MCP, Postgres MCP, radon, curl | Backend architecture, API design, module structure decisions |
-| **DB Architect** | PostgreSQL schema, query optimization, migrations | Postgres MCP, squawk | Schema design, query performance, migration strategy |
-| **UI/UX Designer** | Visual design, interaction patterns, premium aesthetics | Chrome browser, GIF recording | New UI flows, design direction, UX patterns |
-| **Design Auditor** | Visual consistency, component compliance, accessibility | Chrome browser, Lighthouse MCP, pa11y, knip | Review existing UI, consistency checks, accessibility audits |
-| **Frontend QA** | Playwright E2E, React testing, edge case discovery | Playwright MCP (all tools) | Frontend test planning, test case design, testing strategy |
-| **Backend QA** | pytest, integration tests, API contracts, edge cases | Playwright MCP, schemathesis, curl | Backend test planning, test case design, testing strategy |
-| **Remotion Engineer** | Compositions, animation, video processing, captions | ffprobe, mediainfo, ffmpeg | Remotion code, video processing, caption styling |
-| **Security Auditor** | OWASP, auth, data protection, dependency auditing | semgrep, bandit, pip-audit, gitleaks | Security review, auth patterns, vulnerability assessment |
-| **Performance Engineer** | Profiling, caching, bundle analysis, query performance | Chrome browser, Lighthouse MCP, Postgres MCP, k6, hyperfine | Performance issues, optimization, load patterns |
-| **Debug Specialist** | Root cause analysis, cross-service debugging | Chrome browser, Redis MCP | Bug investigation, root cause analysis |
-| **DevOps Engineer** | CI/CD, Docker, K8s, infrastructure | Docker MCP | Infrastructure, deployment, CI/CD setup |
-| **Product Strategist** | Monetization, conversion, feature prioritization, growth | Chrome browser | Business decisions, pricing, feature priority |
-| **Technical Writer** | Feature docs, API docs, architecture decision records | — | Documentation needs |
-| **ML/AI Engineer** | Speech-to-text, transcription models, ML deployment | — | Transcription, ML model decisions |
+### Architecture Team (Lead: Architecture Lead)

-## Handoff Format
+| Agent | What they do | Dispatch when |
+|-------|-------------|---------------|
+| **Architecture Lead** | System-level architecture, cross-service decomposition, API contract design | Orchestrator dispatches for architecture concerns |
+| **Backend Architect** | Python/FastAPI design, API patterns, service layer | Backend architecture decisions, module structure |
+| **Frontend Architect** | Next.js/React/FSD patterns, component architecture | Frontend architecture decisions, component design |
+| **DB Architect** | PostgreSQL schema, query optimization, migrations | Schema design, query performance, migration strategy |
+| **Remotion Engineer** | Video compositions, FFmpeg, caption rendering | Remotion code, video processing, caption styling |
+| **Senior Backend Engineer** | Implements Python/FastAPI code from architect specs | Writing backend production code |
+| **Senior Frontend Engineer** | Implements Next.js/React code from architect/design specs | Writing frontend production code |

-When you need another agent's expertise, include this in your output:
+### Quality Team (Lead: Quality Lead)

-```
-## Handoff Requests
+| Agent | What they do | Dispatch when |
+|-------|-------------|---------------|
+| **Quality Lead** | QA strategy, risk-based testing, quality synthesis | Orchestrator dispatches for verification concerns |
+| **Frontend QA** | Playwright E2E, React testing, accessibility | UI components, user flows, browser behavior |
+| **Backend QA** | pytest, integration tests, API contracts | API endpoints, service logic, task queue behavior |
+| **Security Auditor** | OWASP, auth/JWT, dependency CVEs | Auth flows, user input, file uploads, credentials |
+| **Design Auditor** | Visual consistency, component compliance, a11y | UI consistency, design token adherence, accessibility |
+| **Performance Engineer** | Profiling, caching, query optimization, load testing | Slow queries, bundle size, Core Web Vitals |
+
+### Product Team (Lead: Product Lead)
+
+| Agent | What they do | Dispatch when |
+|-------|-------------|---------------|
+| **Product Lead** | SaaS monetization, conversion, feature prioritization | Orchestrator dispatches for product/UX/docs concerns |
+| **UI/UX Designer** | Visual design, interaction patterns, premium aesthetics | New UI flows, design direction, UX patterns |
+| **Technical Writer** | Feature docs, API docs, architecture decision records | Documentation needs |
+| **ML/AI Engineer** | Speech-to-text, transcription models, ML deployment | Transcription, ML model decisions |
+
+### Staff (Direct to Orchestrator)
+
+| Agent | What they do | Dispatch when |
+|-------|-------------|---------------|
+| **DevOps Engineer** | CI/CD, Docker, K8s, infrastructure | Infrastructure, deployment, CI/CD setup |
+| **Debug Specialist** | Root cause analysis, cross-service debugging | Bug investigation, root cause analysis |
+
+## Dispatch Protocol
+
+### Dispatch Context Object
+
+Every agent dispatch MUST include this context block at the top of the prompt:
+
+    DISPATCH CONTEXT:
+      origin_task: "<original task description>"
+      call_chain: ["agent1", "agent2"]
+      current_depth: <number>
+      max_depth: 3
+      initiating_agent: "<who is dispatching>"
+      reason: "<why this agent is needed>"
+
+Depth counting: `current_depth` equals the length of `call_chain`. The orchestrator starts at depth 0. When it dispatches a lead, the lead receives depth 1. The lead dispatches a specialist at depth 2. A specialist dispatches another agent at depth 3 (terminal — no further dispatch). Staff agents follow the same depth rules as leads.
+
+The calling agent appends itself to `call_chain` and increments `current_depth` before dispatching.
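
The append-then-increment step can be sketched in Python as follows. This is a minimal illustration, not part of the protocol: the `DispatchContext` class, its `for_child` and `render` methods, the lowercase agent identifiers, and the sample task are all hypothetical. Depth is derived from `call_chain` rather than stored, so the two cannot drift apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DispatchContext:
    """Hypothetical in-memory form of the DISPATCH CONTEXT block."""
    origin_task: str
    call_chain: tuple = ()
    max_depth: int = 3
    initiating_agent: str = ""
    reason: str = ""

    @property
    def current_depth(self) -> int:
        # current_depth is defined as the length of call_chain.
        return len(self.call_chain)

    def for_child(self, caller: str, reason: str) -> "DispatchContext":
        """Build the context the caller hands to the agent it dispatches."""
        return DispatchContext(
            origin_task=self.origin_task,
            call_chain=self.call_chain + (caller,),  # caller appends itself
            max_depth=self.max_depth,
            initiating_agent=caller,
            reason=reason,
        )

    def render(self) -> str:
        """Format the block that goes at the top of the prompt."""
        chain = ", ".join(f'"{name}"' for name in self.call_chain)
        return "\n".join([
            "DISPATCH CONTEXT:",
            f'  origin_task: "{self.origin_task}"',
            f"  call_chain: [{chain}]",
            f"  current_depth: {self.current_depth}",
            f"  max_depth: {self.max_depth}",
            f'  initiating_agent: "{self.initiating_agent}"',
            f'  reason: "{self.reason}"',
        ])


# The orchestrator starts at depth 0 with an empty chain (sample task is made up).
root = DispatchContext(origin_task="Add caption presets")
lead_ctx = root.for_child("orchestrator", "architecture review needed")
specialist_ctx = lead_ctx.for_child("architecture-lead", "API contract design")
print(specialist_ctx.render())
```

Here the lead receives depth 1 and the specialist receives depth 2, matching the depth counting described above.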
+
+### Dispatch Modes
+
+**Full dispatch** — spawn the agent with a complete task. Used by leads dispatching their specialists or for substantial cross-team work.
+
+**Consultation** — a focused question expecting a short answer. Prefix the task with `CONSULTATION (not full task):` and end with `SHORT ANSWER EXPECTED.` Used for cross-team quick checks. All cross-team direct calls from specialists MUST use consultation mode.
+
+Both modes can be parallelized by making multiple `Agent` calls in a single response.
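
As a rough sketch, the consultation markers could be applied and checked like this. The helper names `consultation_prompt` and `is_consultation` are hypothetical, as is the sample question; only the prefix and suffix strings come from the protocol.

```python
CONSULTATION_PREFIX = "CONSULTATION (not full task):"
CONSULTATION_SUFFIX = "SHORT ANSWER EXPECTED."


def consultation_prompt(question: str) -> str:
    """Wrap a focused question in the two consultation markers."""
    return f"{CONSULTATION_PREFIX} {question.strip()}\n{CONSULTATION_SUFFIX}"


def is_consultation(task_body: str) -> bool:
    """True when a task body starts and ends with the consultation markers."""
    body = task_body.strip()
    return body.startswith(CONSULTATION_PREFIX) and body.endswith(CONSULTATION_SUFFIX)


print(consultation_prompt("Does the captions endpoint require JWT auth?"))
```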
+
+### Receiving a Dispatch
+
+1. Read dispatch context — check `current_depth`, `call_chain`
+2. If `current_depth >= max_depth`, do NOT dispatch further (this includes consultations). Record in your output: "Depth limit reached, could not consult [X]"
+3. Before dispatching any agent, check whether it is already in `call_chain`. If it is, refuse: "Loop detected: [agent] already in call chain"
+4. Read team protocol and own memory as usual
+5. Execute the task, dispatching sub-agents if needed and depth allows
+6. Return results with audit trail (see Audit Trail section)
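
Steps 2 and 3 amount to a guard that runs before any outgoing dispatch. One possible sketch, assuming the context fields are available as plain values; the function name `dispatch_refusal` is hypothetical:

```python
from typing import Optional, Sequence


def dispatch_refusal(
    target: str,
    call_chain: Sequence[str],
    current_depth: int,
    max_depth: int = 3,
) -> Optional[str]:
    """Return the note to record if the dispatch must be refused, else None."""
    # Step 2: at or past the depth limit nothing may be dispatched,
    # consultations included.
    if current_depth >= max_depth:
        return f"Depth limit reached, could not consult {target}"
    # Step 3: a target that is already upstream would create a loop.
    if target in call_chain:
        return f"Loop detected: {target} already in call chain"
    return None
```

A `None` result means the dispatch may proceed; any string is the note to include in the returned results.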
+
+## Dispatch Rules
+
+**Rule 1: Preferred path first.** Follow the hierarchy: Orchestrator → Leads → Specialists → back to Lead.
+
+**Rule 2: Direct calls allowed with justification.** Subject to Rules 3 and 4, any agent can call any other agent when all three conditions hold: the question is narrow and specific, going through the hierarchy would add latency without value, and the caller knows exactly which specialist to ask. Cross-team direct calls MUST use consultation mode.
+
+**Rule 3: Never dispatch your own lead.** Specialists never dispatch upward to their lead. If lead-level coordination is needed, return results with a note.
+
+**Rule 4: Never dispatch the orchestrator.** Information flows up through return values, not dispatch calls.
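
The hard parts of Rules 2 through 4 can be expressed as a small check. The sketch below copies team membership from the roster tables; the `TEAMS` layout, the `rule_violation` function, and the mode strings `"full"` and `"consultation"` are hypothetical. It also assumes that staff agents count as outside every team, which the rules do not state explicitly.

```python
from typing import Optional

# Team membership as listed in the roster; the dictionary layout is illustrative.
TEAMS = {
    "Architecture Lead": [
        "Backend Architect", "Frontend Architect", "DB Architect",
        "Remotion Engineer", "Senior Backend Engineer", "Senior Frontend Engineer",
    ],
    "Quality Lead": [
        "Frontend QA", "Backend QA", "Security Auditor",
        "Design Auditor", "Performance Engineer",
    ],
    "Product Lead": ["UI/UX Designer", "Technical Writer", "ML/AI Engineer"],
}
LEAD_OF = {member: lead for lead, members in TEAMS.items() for member in members}


def team_of(agent: str) -> Optional[str]:
    """Team key for an agent: its lead's name, or None for staff and the orchestrator."""
    if agent in TEAMS:
        return agent
    return LEAD_OF.get(agent)


def rule_violation(caller: str, target: str, mode: str) -> Optional[str]:
    """Return the broken rule for a proposed call, or None if it is allowed."""
    if target == "Orchestrator":
        return "Rule 4: never dispatch the orchestrator"
    if LEAD_OF.get(caller) == target:
        return "Rule 3: never dispatch your own lead"
    caller_is_specialist = caller in LEAD_OF
    # Assumption: any target outside the caller's own team is cross-team.
    if caller_is_specialist and team_of(target) != LEAD_OF[caller] and mode != "consultation":
        return "Rule 2: cross-team direct calls must use consultation mode"
    return None
```

Rule 1 and the justification conditions in Rule 2 are judgment calls and are deliberately left out of the check.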
+
+## Guardrails
+
+### Depth Limit (Hard)
+
+Max depth 3, counted as the length of `call_chain`:
+
+| Depth | Who | Can dispatch? |
+|-------|-----|---------------|
+| 1 | Lead / Staff (dispatched by orchestrator) | Yes → Specialists, other leads, staff |
+| 2 | Specialist dispatched by lead | Yes → one more agent (prefer consultation) |
+| 3 | Agent dispatched by specialist | No — terminal |
+
+Agents at depth 2 should prefer consultation mode over full dispatch to preserve context budget. Agents at depth 3 cannot dispatch at all — they use Deferred Consultations (see Escalation Paths).
+
+### Loop Prevention (Hard)
+
+Before dispatching, check:
+
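+    def loop_check(target_agent: str, call_chain: list[str]) -> str | None:
+        # Refusal message when the target is already upstream, else None.
+        if target_agent in call_chain:
+            # DO NOT DISPATCH
+            return f"Loop detected: {target_agent} already in call chain {call_chain}"
+        return None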
+
+Absolute — no exceptions.
+
+### Cost Control (Soft)
+
+- **Consultation over dispatch.** Single facts or opinions use consultation mode, not full dispatch.
+- **Dispatch justification required.** Every dispatch includes a `reason`. "I might need their input" is not valid.
+- **Lead budget awareness.** Leads prefer 2-3 specialists over dispatching their entire sub-team.
+
+### Quality Gate
+
+An agent should only dispatch when it can articulate:
+
+1. **WHAT** — specific question or task for the target
+2. **WHY** — why it can't answer this itself
+3. **BACK** — what specific deliverable it needs returned
+
+If all three aren't clear, don't dispatch.
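
A minimal sketch of the gate as a pre-dispatch check, assuming the three answers are captured as strings. `DispatchRequest` and `passes_quality_gate` are hypothetical names. The check only catches missing answers; whether an answer is specific enough remains a judgment the dispatching agent has to make.

```python
from dataclasses import dataclass


@dataclass
class DispatchRequest:
    what: str  # specific question or task for the target
    why: str   # why the caller cannot answer this itself
    back: str  # specific deliverable the caller needs returned


def passes_quality_gate(request: DispatchRequest) -> bool:
    """True only when WHAT, WHY, and BACK are all stated."""
    return all(part.strip() for part in (request.what, request.why, request.back))
```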
+
+## Escalation Paths
+
+| Situation | Action |
+|-----------|--------|
+| Depth limit reached | Return "Deferred Consultation" — lead handles it |
+| Loop detected | Return partial results, note circular dependency |
+| Agent outputs contradict | Return both perspectives — lead resolves or escalates to orchestrator |
+| Task is wrong or out of scope | Return early with "Scope Challenge" — lead decides |
+
+### Deferred Consultation Format
+
+    ## Deferred Consultations

    ### → <Agent Name>
-**Task:** <specific work needed>
-**Context from my analysis:** <what they need to know from your work>
-**I need back:** <specific deliverable>
-**Blocks:** <which part of your work is waiting on this>
-```
+    **Question:** <specific question>
+    **Context:** <what they need to know>
+    **Blocks:** <which part of your work is waiting>

-If you have no handoffs, omit this section entirely.
+### Scope Challenge Format

-## Continuation Format
+    ## Scope Challenge

-You may be invoked in two modes:
+    **Issue:** [What is wrong with the task as given]
+    **Why this matters:** [Impact if we proceed as-is]
+    **Recommendation:** [What should be done instead]
+    **Blocks:** [What work is paused pending resolution]

-**Fresh mode** (default): You receive a task description and context. Start from scratch.
+## Audit Trail

-**Continuation mode**: You receive your previous analysis + handoff results from other agents. Your prompt will contain:
- "Continue your work on: <task>"
- "Your previous analysis: <summary>"
- "Handoff results: <agent outputs>"
+Every agent that made or deferred a dispatch call MUST include this section in its output:

-In continuation mode:
-1. Read the handoff results carefully
-2. Do NOT redo your completed work — build on it
-3. Execute your Continuation Plan using the new information
-4. You may produce NEW handoff requests if continuation reveals further dependencies
+    ## Calls Made
+
+    ### 1. → <Agent Name> (<full dispatch|consultation>)
+    **Reason:** <why this agent was needed>
+    **Asked:** "<1-2 sentence summary of the question/task>"
+    **Got back:** "<1-2 sentence summary of the key result>"
+    **Used in:** <which part of your recommendation this informed>
+
+    ### 2. → <Agent Name> (<full dispatch|consultation>, DEFERRED)
+    **Reason:** <why this agent was needed>
+    **Needs:** <what the agent would answer>
+    **Blocks:** <what part of your work is waiting>
+
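+Field semantics:
+- **Reason**: maps to the Quality Gate "WHY"
+- **Asked**: the actual question (1-2 sentences, not the full prompt)
+- **Got back**: key takeaway from the response (1-2 sentences)
+- **Used in**: which part of the agent's recommendation was informed by this call
+- **DEFERRED**: marks calls blocked by depth/loop limits; these entries carry **Needs** and **Blocks** instead of **Asked**, **Got back**, and **Used in**
+- **Needs**: what the target agent would have been asked to answer
+- **Blocks**: which part of the agent's work is waiting on the deferred call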
+
+When an agent makes no calls, it omits the "Calls Made" section entirely.
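
One way to produce the section consistently is to render it from structured records. The sketch below is illustrative only: `CallRecord` and `render_calls_made` are hypothetical names, and the `mode` field is assumed to hold either `"full dispatch"` or `"consultation"`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CallRecord:
    agent: str
    mode: str                       # "full dispatch" or "consultation"
    reason: str
    asked: Optional[str] = None
    got_back: Optional[str] = None
    used_in: Optional[str] = None
    needs: Optional[str] = None     # deferred calls only
    blocks: Optional[str] = None    # deferred calls only
    deferred: bool = False


def render_calls_made(calls: List[CallRecord]) -> str:
    """Render the "Calls Made" section; empty string when no calls were made."""
    if not calls:
        return ""  # the section is omitted entirely
    lines = ["## Calls Made", ""]
    for index, call in enumerate(calls, start=1):
        tag = f"{call.mode}, DEFERRED" if call.deferred else call.mode
        lines.append(f"### {index}. → {call.agent} ({tag})")
        lines.append(f"**Reason:** {call.reason}")
        if call.deferred:
            lines.append(f"**Needs:** {call.needs}")
            lines.append(f"**Blocks:** {call.blocks}")
        else:
            lines.append(f'**Asked:** "{call.asked}"')
            lines.append(f'**Got back:** "{call.got_back}"')
            lines.append(f"**Used in:** {call.used_in}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```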

 ## Quality Standard