Daniil e6bfe7c946 feat: upgrade agent team with browser, MCP, CLI tools, rules, and hooks
- Add Chrome browser access to 6 visual agents (18 tools each)
- Add Playwright access to 2 testing agents (22 tools each)
- Add 4 MCP servers: Postgres Pro, Redis, Lighthouse, Docker (.mcp.json)
- Add 3 new rules: testing.md, security.md, remotion-service.md
- Add Context7 library references to all domain agents
- Add CLI tool instructions per agent (curl, ffprobe, k6, semgrep, etc.)
- Update team protocol with new capabilities column
- Add orchestrator dispatch guidance for new agent capabilities
- Init git repo tracking docs + Claude config only

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 22:46:16 +03:00

Agent Team Upgrade — Tools, MCPs, Browser Access, Rules & Hooks

Date: 2026-03-21
Status: Draft
Scope: Comprehensive upgrade of all 16 agents with domain-specific tools, MCP servers, browser access, Context7 references, new rules, and hooks

Changelog:

  • v1.0 — Initial draft
  • v1.1 — Fixed MCP package names (Postgres→uvx, Redis→uvx, Lighthouse→bunx, Docker→uvx), all Chrome tools to all 6 agents, all Playwright tools to testing agents, bun over node, verified uv run --group syntax, added curl+context7 for Backend QA and Backend Architect, merged .mcp.json, squawk pipe fix, macOS+Telegram notification via channel config, Backend QA full Playwright access
  • v1.2 — Fixed squawk to lint only new migrations (revision range), fixed Telegram token extraction (cut -d= -f2-), added Bash permissions guidance to installation checklist

1. Browser Access Distribution

Claude-in-Chrome (6 agents)

Primary browser tool for visual inspection, console/network debugging, GIF recording. Shares the user's real Chrome session (cookies, auth state).

All Chrome tools granted to all 6 agents: mcp__claude-in-chrome__tabs_context_mcp, mcp__claude-in-chrome__tabs_create_mcp, mcp__claude-in-chrome__navigate, mcp__claude-in-chrome__computer, mcp__claude-in-chrome__read_page, mcp__claude-in-chrome__find, mcp__claude-in-chrome__form_input, mcp__claude-in-chrome__get_page_text, mcp__claude-in-chrome__javascript_tool, mcp__claude-in-chrome__read_console_messages, mcp__claude-in-chrome__read_network_requests, mcp__claude-in-chrome__resize_window, mcp__claude-in-chrome__gif_creator, mcp__claude-in-chrome__upload_image, mcp__claude-in-chrome__shortcuts_execute, mcp__claude-in-chrome__shortcuts_list, mcp__claude-in-chrome__switch_browser, mcp__claude-in-chrome__update_plan

All tools are available to every Chrome agent. Per-agent instructions direct focus to specific tools:

| Agent | Focus Tools | Primary Use Cases |
|---|---|---|
| UI/UX Designer | gif_creator, resize_window, computer (screenshot) | View localhost:3000 after changes, resize to mobile (375x812) / tablet (768x1024) / desktop (1440x900), GIF-record proposed interaction flows |
| Design Auditor | javascript_tool, get_page_text, read_page, resize_window | Extract computed styles via getComputedStyle(), cross-reference against _variables.scss tokens, screenshot components at breakpoints, read a11y tree for semantic structure |
| Debug Specialist | read_console_messages, read_network_requests, javascript_tool | Navigate to the broken page, filter console by "error\|warn", filter network by "/api/" for 4xx/5xx, execute diagnostic JS |
| Frontend Architect | read_page, computer (screenshot), resize_window | Spot-check Server Component rendering, verify hydration, validate layout after architectural changes |
| Performance Engineer | javascript_tool, read_network_requests, resize_window | Execute performance.getEntries() for LCP/FID/CLS, monitor network waterfall for slow /api/ calls, measure TTFB |
| Product Strategist | read_page, find, computer (screenshot), form_input | Walk localhost:3000 as a new user, assess onboarding/conversion flows, fill forms to test UX, screenshot critical pages, view competitor sites |
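In each agent's .md file these grants live in the frontmatter `tools:` field. A hypothetical sketch of the shape (the agent name, description, and truncated tool list here are illustrative, not the actual files):

```yaml
---
name: debug-specialist            # illustrative agent file
description: Root-causes bugs across frontend and backend
tools: Read, Grep, Bash, mcp__claude-in-chrome__tabs_context_mcp, mcp__claude-in-chrome__tabs_create_mcp, mcp__claude-in-chrome__navigate, mcp__claude-in-chrome__read_console_messages, mcp__claude-in-chrome__read_network_requests, mcp__claude-in-chrome__javascript_tool
---
```

The `tools:` value is a comma-separated list; omitting the field entirely inherits all tools, so the field must be present to scope access.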

Chrome Session Protocol (added to all 6 agents):

## Browser Inspection (Claude-in-Chrome)

When your task involves visual inspection or UI debugging:

1. Call `tabs_context_mcp` to discover existing tabs
2. Call `tabs_create_mcp` to create a fresh tab for this session
3. Store the returned tabId — use it for ALL subsequent browser calls
4. Navigate to `http://localhost:3000` (or the relevant URL)

Guidelines:
- Use `read_page` (accessibility tree) as the primary page-understanding tool
- Use `computer` with action `screenshot` only for visual verification (layout, colors, spacing)
- Before clicking: always screenshot first, then click CENTER of elements
- Filter console messages: always provide a pattern (e.g., "error|warn|Error")
- Filter network requests: use urlPattern "/api/" to avoid noise
- For responsive testing: resize to 375x812 (mobile), 768x1024 (tablet), 1440x900 (desktop)
- Close your tab when done — do not leave orphan tab groups
- NEVER trigger JavaScript alerts/confirms/prompts — they block all browser events

If your task does NOT involve visual inspection, skip browser tools entirely.

Playwright MCP (2 testing agents)

Structured accessibility snapshots, headless execution, cross-browser validation. For test plan design and integration verification only.

All Playwright tools granted to both testing agents: mcp__playwright__browser_click, mcp__playwright__browser_close, mcp__playwright__browser_console_messages, mcp__playwright__browser_drag, mcp__playwright__browser_evaluate, mcp__playwright__browser_file_upload, mcp__playwright__browser_fill_form, mcp__playwright__browser_handle_dialog, mcp__playwright__browser_hover, mcp__playwright__browser_install, mcp__playwright__browser_navigate, mcp__playwright__browser_navigate_back, mcp__playwright__browser_network_requests, mcp__playwright__browser_press_key, mcp__playwright__browser_resize, mcp__playwright__browser_run_code, mcp__playwright__browser_select_option, mcp__playwright__browser_snapshot, mcp__playwright__browser_tabs, mcp__playwright__browser_take_screenshot, mcp__playwright__browser_type, mcp__playwright__browser_wait_for

| Agent | Primary Use Cases |
|---|---|
| Frontend QA | Snapshot component a11y trees for test selector design, verify data-testid coverage, reproduce edge cases (empty states, error states, loading states), cross-browser validation, file upload testing, drag-and-drop testing, dialog handling |
| Backend QA | Verify frontend-backend integration — navigate authenticated flows, check that API responses render correctly, verify WebSocket notification delivery in UI, run Playwright code snippets via browser_run_code |

Playwright Protocol (added to both agents):

## Browser Testing (Playwright MCP)

When verifying UI behavior or designing test plans:

1. Use `browser_snapshot` as your PRIMARY interaction tool (structured a11y tree, ref-based)
2. Use `browser_take_screenshot` only for visual verification — you CANNOT perform actions based on screenshots
3. Prefer `browser_snapshot` with incremental mode for token efficiency on complex pages
4. Use `browser_wait_for` before assertions on async-loaded content
5. Use `browser_console_messages` to check for JS errors during flows
6. Use `browser_network_requests` to verify API calls match expected contracts
7. Use `browser_run_code` for complex multi-step verification (async (page) => { ... })
8. Use `browser_handle_dialog` to accept/dismiss browser dialogs

This is Playwright, not Claude-in-Chrome. Key differences:
- Separate browser instance (does NOT share your login cookies)
- Ref-based interaction (from snapshot), not coordinate-based
- Supports headless mode and cross-browser (Chromium, Firefox, WebKit)
- No GIF recording
- Full Playwright API via browser_run_code

2. MCP Servers

Four new MCP servers, each scoped to specific agents via the `tools:` field in agent frontmatter.

Note: Postgres MCP Pro, Redis MCP, and Docker MCP are Python packages (run via uvx). Lighthouse MCP is a Node package (run via bunx). Exact MCP tool names are discovered at runtime after server start — agent frontmatter will list them once servers are running.

2a. Postgres MCP Pro

Server: crystaldba/postgres-mcp (PyPI: postgres-mcp)
Connects to: postgresql://postgres:postgres@localhost:5332/cofee
Agents: DB Architect, Performance Engineer, Backend Architect

Capabilities used:

  • Live schema inspection — agents verify current DB state without reading models.py
  • pg_stat_statements slow query analysis — Performance Engineer finds N+1 queries
  • Index health checks — unused indexes, missing indexes on foreign keys across 11 modules
  • EXPLAIN ANALYZE execution — DB Architect validates query plans for the 11-module schema

2b. Redis MCP

Server: redis/mcp-redis (PyPI: redis-mcp-server)
Connects to: redis://localhost:6379
Agents: Backend Architect, Debug Specialist

Capabilities used:

  • Dramatiq queue inspection — see pending/failed transcription and render jobs, queue depths
  • Pub/sub channel monitoring — debug WebSocket notification delivery (e.g., when notifications for job_type === "TRANSCRIPTION_GENERATE" jobs don't arrive)
  • Key inspection — check task state, verify job progress tracking

2c. Lighthouse MCP

Server: danielsogl/lighthouse-mcp-server (npm: @danielsogl/lighthouse-mcp)
Audits: any URL (passed as a tool parameter per invocation, not set at config level)
Agents: Performance Engineer, Design Auditor

Capabilities used:

  • Core Web Vitals (LCP, FID, CLS) with structured JSON — not just a score, but actionable breakdown
  • Accessibility audit (WCAG 2.1 AA) — Design Auditor uses alongside visual Chrome inspection and pa11y
  • Performance budget checking — catch regressions when new dependencies are added

2d. Docker MCP

Server: ckreiling/mcp-server-docker (PyPI: mcp-server-docker)
Connects to: the Docker socket
Agents: DevOps Engineer

Capabilities used:

  • Container health checks across compose stack (postgres, redis, minio, api, worker, remotion)
  • Log tailing per container — debug worker crashes, Remotion render failures
  • Container restart — recover from stuck services
  • Compose stack management — start/stop service groups

Complete .mcp.json (project root)

{
  "mcpServers": {
    "postgres": {
      "command": "uvx",
      "args": ["postgres-mcp", "--access-mode=unrestricted"],
      "env": {
        "DATABASE_URI": "postgresql://postgres:postgres@localhost:5332/cofee"
      }
    },
    "redis": {
      "command": "uvx",
      "args": ["--from", "redis-mcp-server@latest", "redis-mcp-server", "--url", "redis://localhost:6379/0"]
    },
    "lighthouse": {
      "command": "bunx",
      "args": ["@danielsogl/lighthouse-mcp@latest"]
    },
    "docker": {
      "command": "uvx",
      "args": ["mcp-server-docker"]
    }
  }
}
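Since uvx and bunx are only launchers, a quick preflight (illustrative, not part of the spec) confirms both are installed before Claude Code tries to start these servers:

```shell
# Check that the MCP launchers referenced in .mcp.json are on PATH.
# uvx ships with uv; bunx ships with bun.
for cmd in uvx bunx; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: MISSING (install uv or bun first)"
  fi
done
```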

3. CLI Tools

3a. Python Tools — uv dependency group

Add to cofee_backend/pyproject.toml under [dependency-groups]:

[dependency-groups]
tools = [
    "semgrep",
    "bandit",
    "pip-audit",
    "schemathesis",
    "radon",
]

Install: cd cofee_backend && uv sync --group tools

Agents invoke with cd cofee_backend && uv run --group tools <tool> ...

(uv run --group is a valid flag — it includes the specified dependency group for the run without needing a prior uv sync --group.)

3b. Node Tools — bunx (zero-install)

No installation needed. Agents invoke directly:

| Tool | Command | Agent |
|---|---|---|
| pa11y | `bunx pa11y http://localhost:3000 --standard WCAG2AA --reporter json` | Design Auditor |
| knip | `cd cofee_frontend && bunx knip --include files,exports,dependencies` | Frontend Architect, Design Auditor |
| squawk | `cd cofee_backend && uv run alembic upgrade <prev>:head --sql 2>/dev/null \| bunx squawk` | DB Architect |

Note: Alembic migrations are .py files, not .sql. The --sql flag runs Alembic in offline mode, printing the generated SQL to stdout instead of executing it, which is what squawk lints.

3c. Brew Binaries

brew install gitleaks k6 hyperfine

| Tool | Command | Agent |
|---|---|---|
| gitleaks | `gitleaks detect --source . --report-format json --no-banner` | Security Auditor |
| k6 | `k6 run --vus 50 --duration 30s <script>.js` | Performance Engineer |
| hyperfine | `hyperfine 'bun run build' --warmup 1` | Performance Engineer |

3d. Agent-Specific CLI Instructions

Each agent gets concrete commands in their instructions, not generic "use tool X":

Security Auditor:

## Security Scanning Tools

Run these from the project root via Bash:

### Python SAST (backend)
cd cofee_backend && uv run --group tools semgrep scan --config p/python --config p/jwt cpv3/
cd cofee_backend && uv run --group tools bandit -r cpv3/ -ll  # medium+ severity only

### Python dependency vulnerabilities
cd cofee_backend && uv run --group tools pip-audit

### Frontend SAST
Note: semgrep is installed in the backend's uv tools group but scans any language.
cd cofee_backend && uv run --group tools semgrep scan --config p/typescript --include "*.ts" --include "*.tsx" ../cofee_frontend/src/

### Secret detection (git history)
gitleaks detect --source . --report-format json --no-banner

All tools are installed project-locally (Python via uv tools group) or via brew (gitleaks).
Do NOT install new tools — use only what is listed above.

Backend QA:

## API Fuzzing

Property-based testing against the FastAPI OpenAPI schema:
cd cofee_backend && uv run --group tools schemathesis run http://localhost:8000/api/schema/ --checks all --workers 4

This auto-generates edge-case payloads for all 11 module endpoints.
Requires the backend to be running (docker-compose up or uv run uvicorn).

## API Testing with curl

For quick endpoint verification and contract testing, use curl with proper headers:

### Authenticated request (replace <token> with a valid JWT)
curl -s -H "Authorization: Bearer <token>" -H "Content-Type: application/json" http://localhost:8000/api/projects/ | python3 -m json.tool

### POST with JSON body
curl -s -X POST -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -d '{"name": "test"}' http://localhost:8000/api/projects/ | python3 -m json.tool

### Measure response time
curl -o /dev/null -s -w "HTTP %{http_code} in %{time_total}s\n" -H "Authorization: Bearer <token>" http://localhost:8000/api/projects/

### Health check
curl -s http://localhost:8000/api/system/health | python3 -m json.tool

Always include Authorization header for protected endpoints. Use -s (silent) and pipe through python3 -m json.tool for readable output.

Backend Architect:

## Code Complexity Analysis

Check cyclomatic complexity of service files (your "when in doubt, put logic in service.py" rule means these grow):
cd cofee_backend && uv run --group tools radon cc cpv3/modules/*/service.py -a -nc

Grade C or worse = too complex, recommend extraction.

## API Testing with curl

Verify endpoints you've designed or modified:

### Authenticated request
curl -s -H "Authorization: Bearer <token>" -H "Content-Type: application/json" http://localhost:8000/api/<endpoint>/ | python3 -m json.tool

### POST with JSON body
curl -s -X POST -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -d '{"key": "value"}' http://localhost:8000/api/<endpoint>/ | python3 -m json.tool

### Measure response time
curl -o /dev/null -s -w "HTTP %{http_code} in %{time_total}s\n" -H "Authorization: Bearer <token>" http://localhost:8000/api/<endpoint>/

Always test your endpoint changes before finalizing recommendations.

## MinIO / S3 Browsing

Browse uploaded videos and rendered outputs:
aws s3 ls --endpoint-url http://localhost:9000 s3://cofee-media/ --recursive
aws s3 ls --endpoint-url http://localhost:9000 s3://cofee-renders/

Requires AWS CLI configured with MinIO credentials (see .env).

DB Architect:

## Migration Linting

Before approving any Alembic migration, lint the generated SQL:
cd cofee_backend && uv run alembic upgrade <prev>:head --sql 2>/dev/null | bunx squawk

Replace `<prev>` with the revision ID before the new migration (find it with `uv run alembic history`).
Squawk catches unsafe patterns: adding NOT NULL without default, CREATE INDEX without CONCURRENTLY, dropping columns with dependent views.
Do NOT lint all migrations from base — only lint the new one.
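Finding `<prev>` can be scripted: the first line of `alembic history` has the form `<prev> -> <new> (head), message`, so its first field is the parent revision. A sketch over a canned history line (the revision IDs and message are made up):

```shell
# Real usage (assumes the newest migration is the one under review):
#   prev=$(cd cofee_backend && uv run alembic history | head -1 | awk '{print $1}')
history_head='abc123 -> def456 (head), add captions table'
prev=$(printf '%s\n' "$history_head" | awk '{print $1}')
echo "$prev"   # prints: abc123
```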

Remotion Engineer:

## Video Inspection Tools

Validate input video before Remotion render:
ffprobe -v quiet -print_format json -show_format -show_streams /path/to/input.mp4

Check output after render (verify caption overlay, resolution, codec):
ffprobe -v quiet -print_format json -show_entries stream=width,height,r_frame_rate,codec_name /path/to/output.mp4

Extract specific frame to verify caption positioning:
ffmpeg -i /path/to/output.mp4 -vf "select=eq(n\,100)" -frames:v 1 /tmp/frame_100.png

Get container metadata (duration, bitrate, audio channels):
mediainfo --Output=JSON /path/to/video.mp4
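ffprobe's JSON output is easy to post-process in the shell; a sketch pulling resolution and frame rate out of it (the sample JSON here is canned — in practice, pipe the ffprobe command above straight in):

```shell
# Extract width/height and frame rate from ffprobe-style JSON.
sample='{"streams":[{"codec_name":"h264","width":1080,"height":1920,"r_frame_rate":"30/1"}]}'
printf '%s' "$sample" | python3 -c '
import sys, json
s = json.load(sys.stdin)["streams"][0]
print(str(s["width"]) + "x" + str(s["height"]) + " @ " + s["r_frame_rate"])
'   # prints: 1080x1920 @ 30/1
```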

Performance Engineer:

## Load Testing

Load-test the transcription endpoint under concurrent video submissions:
k6 run --vus 50 --duration 30s <script>.js

Benchmark build times:
hyperfine 'cd cofee_frontend && bun run build' --warmup 1
hyperfine 'cd cofee_backend && uv run pytest tests/' --min-runs 3

DevOps Engineer:

## MinIO / S3 Browsing

Browse and verify storage contents:
aws s3 ls --endpoint-url http://localhost:9000 s3://cofee-media/ --recursive

Requires AWS CLI configured with MinIO credentials (see .env).

4. Context7 Library References

Each agent gets specific library IDs in their instructions for targeted documentation lookup.

Instruction block added to each agent:

## Context7 Documentation Lookup

When you need current API docs, use these pre-resolved library IDs:
- mcp__context7__resolve-library-id is NOT needed for these — call query-docs directly.

<agent-specific library table here>

Example: mcp__context7__query-docs with libraryId="/vercel/next.js" and topic="app router server components"

Note: Library IDs may change over time. If query-docs returns no results for a known library, fall back to resolve-library-id to get the current ID.

| Agent | Libraries |
|---|---|
| Frontend Architect | /vercel/next.js (App Router, Server Components), /tanstack/query (v5 hooks, queries, mutations), /websites/radix-ui_primitives (component APIs, slot structure) |
| Backend Architect | /websites/fastapi_tiangolo (dependency injection, middleware), /websites/sqlalchemy_en_21 (async sessions, relationships), /pydantic/pydantic (v2 validators, model_config), /bogdanp/dramatiq (actors, middleware, retry) |
| DB Architect | /websites/sqlalchemy_en_21 (Alembic, DDL, type system), /websites/sqlalchemy_en_20_orm (relationship loading, hybrid properties) |
| Remotion Engineer | /websites/remotion_dev (interpolate, spring, composition config), /remotion-dev/remotion (bundle, render CLI), /remotion-dev/skills (best practices) |
| Frontend QA | /websites/playwright_dev (locators, expect, fixtures), /microsoft/playwright (test config, reporters), /tanstack/query (testing patterns) |
| Backend QA | /websites/fastapi_tiangolo (TestClient, dependency overrides), /pydantic/pydantic (schema edge cases), /bogdanp/dramatiq (test broker, StubBroker). For curl patterns, use resolve-library-id with query "curl" if needed. |
| Performance Engineer | /vercel/next.js (caching, ISR, static generation), /websites/fastapi_tiangolo (middleware, async patterns), /redis/redis-py (connection pooling, pipelines) |
| Security Auditor | /websites/fastapi_tiangolo (OAuth2, JWT, Security dependencies), /pydantic/pydantic (strict mode, input validation) |
| ML/AI Engineer | /websites/fastapi_tiangolo (BackgroundTasks, streaming), /bogdanp/dramatiq (actor retry, timeout, priority) |
| DevOps Engineer | /vercel/next.js (standalone output, Docker build), /websites/fastapi_tiangolo (workers, deployment settings) |
| UI/UX Designer | /websites/radix-ui_primitives (available components, API constraints) |
| Design Auditor | /websites/radix-ui_primitives (correct props, slot structure, accessibility) |
| Orchestrator | Generic access — queries ad-hoc based on task domain |
| Technical Writer | Generic access — queries based on documentation target |
| Product Strategist | Generic access — queries based on feature research |

5. New Rules Files

5a. .claude/rules/testing.md (no path scope — universal)

# Testing Conventions

## Backend Tests
- Real DB + real Redis. No mocks. conftest.py has shared fixtures.
- Location: cofee_backend/tests/integration/<module>.py
- Naming: test_<action>_<scenario> (e.g., test_create_project_without_name)
- Run: cd cofee_backend && uv run pytest
- Single test: uv run pytest -k "test_name"
- API fuzzing: cd cofee_backend && uv run --group tools schemathesis run http://localhost:8000/api/schema/ --checks all

## Frontend E2E Tests
- Playwright with data-testid selectors on every interactive element
- Location: cofee_frontend/tests/
- Run: cd cofee_frontend && bun run test:e2e
- Every component root element must have data-testid

## General
- Never mock the database — use real test DB
- Tests must be deterministic — no Date.now(), no Math.random()
- Test error paths, not just happy paths

5b. .claude/rules/security.md (no path scope — universal)

# Security Conventions

## Authentication
- JWT tokens via get_current_user dependency injection
- Passwords: bcrypt hash, never plain text
- Token refresh: handled by users module

## File Uploads
- Validated by extension + MIME type in files module
- Upload via uploadFile() from @shared/api/uploadFile — never raw FormData
- Endpoint: /api/files/upload/

## Secrets Management
- All config via get_settings() (cached @lru_cache) — never hardcode
- S3/MinIO credentials: env vars only, never in code or commits
- JWT secret: env var, never in code

## Data Protection
- Soft deletes: is_deleted flag — ensure deleted records never leak through API responses
- CORS: configured in main.py — restrict to frontend origin in production
- SQL injection: prevented by SQLAlchemy parameterized queries — never use raw SQL strings
- XSS: React auto-escapes — never use dangerouslySetInnerHTML

## Scanning Tools (for Security Auditor agent)
- Python SAST: semgrep + bandit (via uv run --group tools)
- Dependency CVEs: pip-audit (via uv run --group tools)
- Secret detection: gitleaks (via brew)

5c. .claude/rules/remotion-service.md

---
paths:
  - "remotion_service/**"
---

# Remotion Service Rules

## Animations
- ONLY use Remotion interpolate()/spring() for all animations
- NEVER use CSS transitions, CSS animations, or Framer Motion
- All timing must be frame-based, not time-based

## Compositions
- Deterministic frame rendering: no Date.now(), no Math.random(), no network calls during render
- All data must be passed via inputProps from the server
- useCurrentFrame() and useVideoConfig() for all timing calculations

## Server
- ElysiaJS, single POST /api/render endpoint
- Flow: receive S3 path + transcription → Remotion CLI render → upload to S3 → return path
- Health check: GET /health

## Captions
- All caption presets live in src/components/captions/
- Caption data format: Word[] with start/end timestamps from transcription module

## Video Inspection
- Use ffprobe (installed) to validate input video codec/resolution/fps before render
- Use ffprobe to verify output after render
- Use ffmpeg to extract single frames for visual caption verification
- Use mediainfo for detailed container metadata

6. Hooks

6a. PreCompact — Context Preservation

Added to settings.local.json. Hook stdout is injected into compaction context as a system reminder.

{
  "PreCompact": [
    {
      "matcher": "",
      "hooks": [
        {
          "type": "command",
          "command": "echo 'PRESERVE ACROSS COMPACTION: 1) All modified files and their purposes 2) Test results (pass/fail with commands) 3) Architecture decisions made this session 4) Error messages and resolutions 5) Current subproject (frontend/backend/remotion) 6) Pending agent handoff requests 7) Current task/phase in any active plan'"
        }
      ]
    }
  ]
}

6b. Notification — macOS Desktop Alert + Telegram

Two hooks fire on Notification events. The macOS notification always fires. The Telegram notification reads the bot token and chat ID from the existing Telegram channel config at ~/.claude/channels/telegram/, so there are no env vars to configure; it reuses what was already set up via /telegram:configure and silently skips if the Telegram channel is not configured.

{
  "Notification": [
    {
      "matcher": "",
      "hooks": [
        {
          "type": "command",
          "command": "osascript -e 'display notification \"Claude Code needs your attention\" with title \"Cofee Project\"' 2>/dev/null; exit 0"
        },
        {
          "type": "command",
          "command": "CHAT_ID=$(cat ~/.claude/channels/telegram/access.json 2>/dev/null | python3 -c \"import sys,json; a=json.load(sys.stdin); print(a['allowFrom'][0] if a.get('allowFrom') else '')\" 2>/dev/null) && TOKEN=$(grep TELEGRAM_BOT_TOKEN ~/.claude/channels/telegram/.env 2>/dev/null | cut -d= -f2-) && [ -n \"$CHAT_ID\" ] && [ -n \"$TOKEN\" ] && curl -s -X POST \"https://api.telegram.org/bot$TOKEN/sendMessage\" -d \"chat_id=$CHAT_ID\" -d \"text=Claude Code needs your attention (Cofee Project)\" > /dev/null 2>&1; exit 0"
        }
      ]
    }
  ]
}
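The `cut -d= -f2-` in the Telegram hook is deliberate: bot tokens can contain `=` characters, and the open-ended field range keeps everything after the first delimiter (plain `-f2` would truncate at the second `=`). A standalone demonstration with a fake token:

```shell
# Reproduce the hook's token extraction outside the hook.
tmp=$(mktemp)
printf 'TELEGRAM_BOT_TOKEN=123456:AAfake==token\n' > "$tmp"
TOKEN=$(grep TELEGRAM_BOT_TOKEN "$tmp" | cut -d= -f2-)
echo "$TOKEN"   # prints: 123456:AAfake==token
rm "$tmp"
```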

6c. Backend Auto-Format Upgrade

The current backend hook runs only ruff check. Upgrade it to ruff check --fix plus ruff format:

Before:

{
  "type": "command",
  "command": "filepath=$(cat | jq -r '.tool_input.file_path // empty') && case \"$filepath\" in */cofee_backend/cpv3/*.py) cd cofee_backend && uv run ruff check \"$filepath\" 2>&1 | head -20 ;; esac; exit 0"
}

After:

{
  "type": "command",
  "command": "filepath=$(cat | jq -r '.tool_input.file_path // empty') && case \"$filepath\" in */cofee_backend/cpv3/*.py) cd cofee_backend && uv run ruff check --fix \"$filepath\" 2>&1 | head -20 && uv run ruff format \"$filepath\" 2>&1 | head -5 ;; esac; exit 0"
}
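Both variants share the same `case` path filter, which is what keeps the hook from touching frontend files or anything outside cpv3/. A standalone check of the glob (the paths are illustrative):

```shell
# Reproduce the hook's path filter.
matches_backend_py() {
  case "$1" in
    */cofee_backend/cpv3/*.py) echo yes ;;
    *) echo no ;;
  esac
}
matches_backend_py /repo/cofee_backend/cpv3/modules/projects/service.py  # prints: yes
matches_backend_py /repo/cofee_frontend/src/app/page.tsx                 # prints: no
matches_backend_py /repo/cofee_backend/tests/conftest.py                 # prints: no
```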

7. Per-Agent Instruction Changes

Summary of what changes in each agent's .md file.

7.1 Orchestrator (orchestrator.md)

Changes:

  • Updated team roster table with new capabilities column showing what each agent can now do that it couldn't before
  • Dispatch guidance: "If the task involves visual inspection, include 'Use Chrome browser tools to...' in the agent context"
  • Dispatch guidance: "If the task involves database schema or query performance, dispatch DB Architect who can now inspect the live database via Postgres MCP"
  • Dispatch guidance: "If the task involves Dramatiq job debugging, dispatch Debug Specialist or Backend Architect who can now inspect Redis directly"

7.2 UI/UX Designer (ui-ux-designer.md)

Changes:

  • tools: add all Chrome tools (18 tools, see Section 1)
  • Add Chrome Session Protocol block
  • Add Context7 block with /websites/radix-ui_primitives
  • Add instruction: "When proposing a design, if the dev server is running, navigate to localhost:3000 to see the current UI state before recommending changes"
  • Add instruction: "Use resize_window to verify your proposals work at mobile (375x812), tablet (768x1024), and desktop (1440x900)"
  • Add instruction: "Use gif_creator to record interaction demos when proposing animations or multi-step flows"

7.3 Design Auditor (design-auditor.md)

Changes:

  • tools: add all Chrome tools (18 tools) + Lighthouse MCP tools
  • Add Chrome visual audit protocol
  • Add Context7 block with /websites/radix-ui_primitives
  • Add Lighthouse accessibility audit instructions
  • Add CLI tools block: bunx pa11y for WCAG 2.1 AA, bunx knip for dead FSD exports
  • Add instruction: "Use javascript_tool with getComputedStyle(document.querySelector('[data-testid=\"...\"]')) to extract actual rendered values and compare against _variables.scss tokens"
  • Add instruction: "Cross-reference Lighthouse accessibility issues with visual Chrome inspection — Lighthouse catches ARIA violations, Chrome shows visual presentation"

7.4 Debug Specialist (debug-specialist.md)

Changes:

  • tools: add all Chrome tools (18 tools) + Redis MCP tools
  • Add Chrome debugging protocol
  • Add instruction: "For UI bugs, reproduce in Chrome before investigating code. Navigate to the affected page, interact with it, read console with pattern 'error|warn|Error', and check network requests filtered by '/api/'"
  • Add instruction: "For notification delivery bugs, inspect Redis pub/sub channels directly to determine if the backend published the event"
  • Add instruction: "For stuck Dramatiq jobs, inspect Redis keys to see queue depth and job state"

7.5 Frontend Architect (frontend-architect.md)

Changes:

  • tools: add all Chrome tools (18 tools)
  • Add Chrome spot-check protocol
  • Add Context7 block with /vercel/next.js, /tanstack/query, /websites/radix-ui_primitives
  • Add CLI tools block: bunx knip for dead exports
  • Add instruction: "After recommending architectural changes, spot-check the result in Chrome to verify components render correctly and hydration succeeds"

7.6 Performance Engineer (performance-engineer.md)

Changes:

  • tools: add all Chrome tools (18 tools) + Lighthouse MCP tools + Postgres MCP Pro tools
  • Add Chrome performance protocol
  • Add Context7 block with /vercel/next.js, /websites/fastapi_tiangolo, /redis/redis-py
  • Add Lighthouse audit instructions: "Pass url: 'http://localhost:3000' as a tool parameter to each Lighthouse tool invocation"
  • Add CLI tools block: k6 for load testing, hyperfine for benchmarking
  • Add instruction: "For backend performance, use Postgres MCP Pro to query pg_stat_statements for the slowest queries across the 11 modules"
  • Add instruction: "For frontend performance, run Lighthouse audit first, then use Chrome JS execution for targeted measurements"

7.7 Product Strategist (product-strategist.md)

Changes:

  • tools: add all Chrome tools (18 tools)
  • Add Chrome UX walkthrough protocol
  • Add instruction: "When evaluating the product, navigate localhost:3000 as a first-time user would. Document: what do they see first? What's the path to value? Where is friction?"
  • Add instruction: "When comparing competitors, navigate to competitor sites and screenshot relevant flows"
  • Add instruction: "Use form_input to fill sign-up/onboarding forms and test the conversion funnel end-to-end"

7.8 Frontend QA (frontend-qa.md)

Changes:

  • tools: add all Playwright MCP tools (22 tools, see Section 1)
  • Add Playwright protocol block
  • Add Context7 block with /websites/playwright_dev, /microsoft/playwright, /tanstack/query
  • Add instruction: "Use browser_snapshot to inspect the accessibility tree of components under test. Verify every interactive element has data-testid. Use the snapshot refs to design reliable test selectors"
  • Add instruction: "Reproduce edge cases before recommending tests: navigate to the page, trigger empty states, error states, and loading states via Playwright to confirm the behavior you're testing for"
  • Add instruction: "Use browser_file_upload to test file upload flows, browser_drag for drag-and-drop, browser_handle_dialog for confirmation dialogs"

7.9 Backend QA (backend-qa.md)

Changes:

  • tools: add all Playwright MCP tools (22 tools, see Section 1)
  • Add Playwright protocol block
  • Add Context7 block with /websites/fastapi_tiangolo, /pydantic/pydantic, /bogdanp/dramatiq. For curl, use resolve-library-id with query "curl" if needed.
  • Add CLI tools block: schemathesis commands + curl patterns with headers (see Section 3d)
  • Add instruction: "For integration testing, use Playwright to verify that API responses render correctly in the frontend — navigate to the page, trigger the action, check network requests match expected contracts"
  • Add instruction: "Run schemathesis against /api/schema/ to find endpoints that return 500 errors under edge-case payloads"
  • Add instruction: "Use curl with -H 'Authorization: Bearer ' for quick endpoint verification. Always include Content-Type and Authorization headers for protected endpoints."

7.10 Security Auditor (security-auditor.md)

Changes:

  • No new MCP tools
  • Add Context7 block with /websites/fastapi_tiangolo, /pydantic/pydantic
  • Add CLI tools block: semgrep, bandit, pip-audit, gitleaks commands (see Section 3d)
  • Add instruction: "Start every security review by running the scanning tools. Report findings with severity, file:line, and remediation recommendation"
  • Add instruction: "For the frontend, run semgrep with the typescript config against cofee_frontend/src/ (invoked from cofee_backend/ since semgrep is in the backend tools group)"
  • Add instruction: "Check git history for leaked secrets with gitleaks before any deployment-related review"
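The scanning pass described above can be sketched as shell commands. Paths (`app/`) and semgrep ruleset names are assumptions; the Python tools are assumed to live in the backend tools dependency group per Section 3a:

```shell
cd cofee_backend

# Static analysis with semgrep's auto-detected rules (ruleset choice is an assumption).
uv run --group tools semgrep scan --config auto app/

# Python-specific security linting.
uv run --group tools bandit -r app/

# Known-vulnerable dependency check.
uv run --group tools pip-audit

# Frontend scan, invoked from cofee_backend/ as noted above.
uv run --group tools semgrep scan --config p/typescript ../cofee_frontend/src/

# Leaked-secret sweep across git history (brew-installed binary).
gitleaks detect --source .. --redact
```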

7.11 DB Architect (db-architect.md)

Changes:

  • tools: add Postgres MCP Pro tools
  • Add Context7 block with /websites/sqlalchemy_en_21, /websites/sqlalchemy_en_20_orm
  • Add CLI tools block: squawk via pipe pattern
  • Add instruction: "Use Postgres MCP to inspect the live schema rather than reading models.py — the live database is the source of truth; models.py may be out of sync during migration development"
  • Add instruction: "Before approving any Alembic migration, lint only the new revisions with squawk, using a revision range so previously applied migrations are skipped: cd cofee_backend && uv run alembic upgrade <last-applied-rev>:head --sql | bunx squawk"
  • Add instruction: "Use pg_stat_statements to identify the slowest queries and recommend index improvements"
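The two checks above can be sketched as one-liners. The revision range and connection string are placeholders, and the query assumes the pg_stat_statements extension is installed (column names per PostgreSQL 13+):

```shell
# Lint only the pending migrations: generate offline SQL for the new range, pipe to squawk.
cd cofee_backend && uv run alembic upgrade <last-applied-rev>:head --sql | bunx squawk

# Top 10 slowest queries by mean execution time.
psql "$DATABASE_URI" -c "
  SELECT mean_exec_time, calls, left(query, 80) AS query
  FROM pg_stat_statements
  ORDER BY mean_exec_time DESC
  LIMIT 10;"
```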

7.12 Backend Architect (backend-architect.md)

Changes:

  • tools: add Redis MCP tools + Postgres MCP Pro tools
  • Add Context7 block with /websites/fastapi_tiangolo, /websites/sqlalchemy_en_21, /pydantic/pydantic, /bogdanp/dramatiq
  • Add CLI tools block: radon, curl patterns, MinIO browsing commands (see Section 3d)
  • Add instruction: "Use Redis MCP to inspect Dramatiq queue state when designing or reviewing task processing patterns"
  • Add instruction: "Check service.py complexity with radon — grade C or worse means the file needs extraction into helper functions"
  • Add instruction: "Test your endpoint designs with curl before finalizing recommendations"
  • Add instruction: "Browse MinIO buckets with aws s3 ls --endpoint-url http://localhost:9000 when verifying file storage patterns. Requires AWS CLI configured with MinIO credentials (see .env)."
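The complexity and storage checks can be sketched as below. The `app/` path, health endpoint, and bucket name are illustrative assumptions:

```shell
cd cofee_backend

# Cyclomatic complexity report; -a prints the average, -nc shows only grade C or worse.
uv run --group tools radon cc app/ -a -nc

# Quick endpoint probe while iterating on a design (path is hypothetical).
curl -sS http://localhost:8000/api/v1/health -w "\n%{http_code}\n"

# Browse MinIO buckets (AWS CLI configured with MinIO credentials from .env).
aws s3 ls --endpoint-url http://localhost:9000
aws s3 ls s3://<bucket-name>/ --endpoint-url http://localhost:9000
```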

7.13 Remotion Engineer (remotion-engineer.md)

Changes:

  • No new MCP tools
  • Add Context7 block with /websites/remotion_dev, /remotion-dev/remotion, /remotion-dev/skills
  • Add CLI tools block: ffprobe, mediainfo, ffmpeg commands (see Section 3d)
  • Add instruction: "Validate input video before recommending Remotion composition changes: check codec, resolution, frame rate, and audio streams with ffprobe"
  • Add instruction: "After render, verify output with ffprobe and extract a test frame with ffmpeg to confirm caption overlay positioning"
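The pre- and post-render checks can be sketched as follows (file names and the 5-second seek point are placeholders):

```shell
# Inspect codec, resolution, frame rate, and audio streams of the input.
ffprobe -v error -show_entries \
  stream=codec_type,codec_name,width,height,r_frame_rate,sample_rate \
  -of json input.mp4

# After render: confirm container-level duration and size.
ffprobe -v error -show_entries format=duration,size -of json out/render.mp4

# Extract a single frame at 5s to eyeball caption overlay positioning.
ffmpeg -ss 5 -i out/render.mp4 -frames:v 1 -y /tmp/frame_check.png
```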

7.14 DevOps Engineer (devops-engineer.md)

Changes:

  • tools: add Docker MCP tools
  • Add Context7 block with /vercel/next.js, /websites/fastapi_tiangolo
  • Add MinIO browsing via Bash instruction (requires AWS CLI + MinIO credentials from .env)
  • Add instruction: "Use Docker MCP to inspect container health, tail logs, and manage the compose stack instead of crafting docker CLI commands"
  • Add instruction: "For Next.js deployment, query Context7 for standalone output mode and Docker build patterns"

7.15 ML/AI Engineer (ml-ai-engineer.md)

Changes:

  • No new MCP tools, no new CLI tools
  • Add Context7 block with /websites/fastapi_tiangolo, /bogdanp/dramatiq
  • Add instruction: "When modifying transcription actors, query Dramatiq docs for retry/timeout configuration and middleware patterns"

7.16 Technical Writer (technical-writer.md)

Changes:

  • No new MCP tools, no new CLI tools
  • Context7: generic access, queries based on documentation target
  • Add instruction: "When documenting APIs, query the FastAPI docs for the current endpoint decorator patterns to ensure documentation matches implementation"

8. Installation Checklist

One-time setup (run once):

  1. Python tools group:

    cd cofee_backend
    # Add [dependency-groups] tools = [...] to pyproject.toml (see Section 3a)
    uv sync --group tools
    
  2. Brew binaries:

    brew install gitleaks k6 hyperfine
    
  3. MCP servers — create .mcp.json in the project root using the complete merged config from Section 2. Then add the MCP tool permissions to the settings.local.json permissions.allow list once the tool names are discovered.
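For orientation, a single-server entry follows the shape below — the full merged config with all four servers lives in Section 2, and the connection string is a placeholder:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "uvx",
      "args": ["postgres-mcp", "--access-mode=unrestricted"],
      "env": { "DATABASE_URI": "postgresql://user:pass@localhost:5432/cofee" }
    }
  }
}
```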

  4. Rules files (create 3 new files):

    .claude/rules/testing.md          (content: Section 5a)
    .claude/rules/security.md         (content: Section 5b)
    .claude/rules/remotion-service.md (content: Section 5c)
    
  5. Hooks (update settings.local.json):

    • Add PreCompact hook (Section 6a)
    • Add Notification hook (Section 6b) — Telegram delivery works automatically if a channel is configured via /telegram:configure
    • Replace backend ruff hook with upgraded version (Section 6c)
  6. Bash permissions (update settings.local.json permissions.allow): Add these patterns so agents can run new CLI tools without per-invocation prompts:

    "Bash(uv run --group tools:*)",
    "Bash(gitleaks:*)",
    "Bash(k6:*)",
    "Bash(hyperfine:*)",
    "Bash(ffprobe:*)",
    "Bash(ffmpeg:*)",
    "Bash(mediainfo:*)",
    "Bash(aws s3:*)",
    "Bash(bunx pa11y:*)",
    "Bash(bunx knip:*)",
    "Bash(bunx squawk:*)"
    
  7. Agent files (update 16 .md files):

    • Update tools: frontmatter per Section 7
    • Add browser protocol sections (Chrome or Playwright)
    • Add Context7 library reference blocks
    • Add CLI tool instruction blocks

No installation needed:

  • Node CLI tools (pa11y, knip, squawk) — agents use bunx, zero-install
  • Chrome tools — already available via claude-in-chrome MCP
  • Playwright tools — already available via playwright MCP
  • Context7 — already configured
  • Telegram notifications — uses existing channel config from ~/.claude/channels/telegram/
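The notification hook reads the bot token from that channel config. Since token values can themselves contain `=`, the extraction keeps everything after the first delimiter (`cut -d= -f2-`). A minimal sketch — the config line shown is hypothetical, not the actual channel file format:

```shell
# Hypothetical line from the ~/.claude/channels/telegram/ config.
line='TELEGRAM_BOT_TOKEN=123456:ABC=XYZ'

# -f2- keeps field 2 through the end of the line, so '=' inside the value survives.
token="$(printf '%s\n' "$line" | cut -d= -f2-)"
echo "$token"   # → 123456:ABC=XYZ
```

With `-f2` alone the token would be truncated at the second `=`; `-f2-` avoids that.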

Verification after setup:

After completing installation, verify each MCP server starts correctly:

  1. uvx postgres-mcp --access-mode=unrestricted with DATABASE_URI set — should connect to PostgreSQL
  2. uvx --from redis-mcp-server@latest redis-mcp-server --url redis://localhost:6379/0 — should connect to Redis
  3. bunx @danielsogl/lighthouse-mcp@latest — should start Lighthouse server
  4. uvx mcp-server-docker — should connect to Docker socket

Then dispatch a test task to one agent from each tool category to confirm tools work end-to-end.