remotion_service/docs/superpowers/specs/2026-03-14-captions-wizard-integration-design.md
Daniil e6bfe7c946 feat: upgrade agent team with browser, MCP, CLI tools, rules, and hooks
2026-03-21 22:46:16 +03:00


# Captions Wizard Integration — Design Spec
## Context
The backend captions module (`/api/captions/*`) and caption generation task (`/api/tasks/captions-generate/`) are fully implemented but have no frontend UI. This spec covers integrating captions into the Project Wizard as 3 new steps, allowing users to select/manage caption presets, trigger rendering, and view/download the captioned video.
## Requirements
- Add caption-settings, caption-processing, caption-result wizard steps (positions 9-11)
- Full CRUD for caption presets (system + user presets)
- Tab-switch layout: preset selection grid and full-page style editor
- Static preview text in the editor that updates live with style changes
- Reuse ProcessingStep for caption-processing
- Result step: video player + download + re-render button
- All UI text in Russian
## Wizard Step Flow
```
... → subtitle-revision → caption-settings → caption-processing → caption-result
```
| Step Key | Label | Component | New? |
|----------|-------|-----------|------|
| `caption-settings` | Настройка субтитров | `CaptionSettingsStep` | Yes |
| `caption-processing` | Обработка | `ProcessingStep` | Reused |
| `caption-result` | Результат | `CaptionResultStep` | Yes |
### Navigation
- `subtitle-revision` → `caption-settings` (change existing "Завершить проект" button to "Далее" + add `goToStep("caption-settings")` call)
- `caption-settings` "Генерировать" → sets active job → auto-navigates to `caption-processing`
- `caption-processing` job completes → auto-navigates to `caption-result`
- `caption-result` "Перегенерировать" → loops back to `caption-settings`
- `caption-result` "Завершить" → marks completed, wizard finished
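
The navigation rules above can be sketched as a small lookup table. This is a minimal sketch, not the wizard's actual implementation: the step keys come from this spec, while the `CaptionEvent` names, the `"done"` marker, and the `nextCaptionStep` helper are hypothetical names introduced for illustration.

```typescript
// Step keys taken from this spec.
type CaptionStep =
  | "subtitle-revision"
  | "caption-settings"
  | "caption-processing"
  | "caption-result";

// Hypothetical event names, one per navigation rule listed above.
type CaptionEvent = "next" | "generate" | "job-done" | "regenerate" | "finish";

// "done" is a hypothetical marker meaning the wizard is finished.
type CaptionTarget = CaptionStep | "done";

const TRANSITIONS: Record<CaptionStep, Partial<Record<CaptionEvent, CaptionTarget>>> = {
  "subtitle-revision": { next: "caption-settings" },
  "caption-settings": { generate: "caption-processing" },
  "caption-processing": { "job-done": "caption-result" },
  "caption-result": { regenerate: "caption-settings", finish: "done" },
};

// Returns the target step, or null when the event is not valid for the current step.
function nextCaptionStep(step: CaptionStep, event: CaptionEvent): CaptionTarget | null {
  return TRANSITIONS[step][event] ?? null;
}
```

Keeping the rules in one table makes the "Перегенерировать" loop explicit and lets invalid transitions (for example `finish` from `caption-settings`) fall through to `null`.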
## WizardContext Changes
New state fields in `WizardContextValue`:
```typescript
captionPresetId: string | null // Selected preset UUID
captionStyleConfig: object | null // Inline style override (custom not-yet-saved config)
captionedVideoPath: string | null // S3 path of rendered captioned video
```
These are persisted to `project.workspace_state.wizard` alongside existing fields.
### Auto-advance logic (WizardContext effect)
1. **Update `isJobActive` guard**: Add `currentStep === "caption-processing"` to the polling condition (alongside existing `"processing"` and `"transcription-processing"` checks) so task status polling fires during caption processing.
2. **New CAPTIONS_GENERATE case**: When `activeJobType === "CAPTIONS_GENERATE"` and task status becomes DONE → read `taskStatus.output_data.output_path` to get the captioned video S3 path (this data is NOT available in Redux notifications — it must come from the task status polling response). Store in `captionedVideoPath`, clear active job, navigate to `caption-result`.
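
The two changes above can be sketched as pure functions over the wizard state. This is an illustration under assumptions: the `TaskStatus` and `CaptionWizardState` shapes and the names `activeJobId`, `shouldPollTaskStatus`, and `applyCaptionTaskStatus` are hypothetical, and only `output_data.output_path`, the `DONE` status, the step keys, and `captionedVideoPath` come from this spec.

```typescript
// Assumed shape of the task status polling response.
interface TaskStatus {
  status: string; // the spec names "DONE" and "FAILED"; other values are not specified
  output_data?: { output_path?: string } | null;
}

// Assumed subset of the wizard state relevant to auto-advance.
interface CaptionWizardState {
  currentStep: string;
  activeJobId: string | null; // hypothetical field name
  activeJobType: string | null;
  captionedVideoPath: string | null;
}

// Steps during which task status polling should run.
const POLLING_STEPS = ["processing", "transcription-processing", "caption-processing"];

function shouldPollTaskStatus(state: CaptionWizardState): boolean {
  return state.activeJobId !== null && POLLING_STEPS.indexOf(state.currentStep) !== -1;
}

// Returns the next state once a CAPTIONS_GENERATE task reports DONE;
// returns the state unchanged in every other case.
function applyCaptionTaskStatus(state: CaptionWizardState, task: TaskStatus): CaptionWizardState {
  if (state.activeJobType !== "CAPTIONS_GENERATE" || task.status !== "DONE") {
    return state;
  }
  const outputPath = task.output_data?.output_path;
  if (!outputPath) {
    // DONE without a path: keep the state so the caller can surface an error.
    return state;
  }
  return {
    ...state,
    captionedVideoPath: outputPath,
    activeJobId: null,
    activeJobType: null,
    currentStep: "caption-result",
  };
}
```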
### Where `transcription_id` comes from
The `useSubmitCaptionGenerate` hook needs a `transcription_id`. This comes from `transcriptionArtifactId` in WizardContext (set during the transcription flow). The hook reads it from context and passes it in the request body.
## CaptionSettingsStep
Two sub-views controlled by local state (`activeTab: "select" | "editor"`).
### Tab 1: Preset Selection ("Выбор пресета")
**Data**: `api.useQuery("get", "/api/captions/presets/")` → returns system + user presets
**Layout**:
- Grid of preset cards (3 columns)
- Each card:
  - Dark preview area with styled "Пример" text (CSS-styled based on `style_config`)
  - Preset name below preview
  - "Системный" badge for `is_system === true`
  - Edit (pencil) + Delete (trash) icon buttons, hidden for system presets
- Last card: "+ Создать пресет" (dashed border, click opens editor)
- Selected card: highlighted border (indigo)
**Footer**: "← Назад" (to subtitle-revision) + "Генерировать →" (disabled until preset selected)
**Actions**:
- Click card → `captionPresetId = preset.id`, highlight
- Click edit → `setActiveTab("editor")`, load preset's `style_config` into form
- Click delete → confirmation dialog → `DELETE /api/captions/presets/{id}/` → invalidate query cache
- Click "+ Создать" → `setActiveTab("editor")`, form with default values
- Click "Генерировать" → call `useSubmitCaptionGenerate()` → on success: `setActiveJob(job_id, "CAPTIONS_GENERATE")`, `markStepCompleted("caption-settings")`, `goToStep("caption-processing")`
### Tab 2: Style Editor ("Редактор стиля")
**Layout**:
- **Top**: Large preview panel (dark bg) — "Пример субтитров" text styled live from form values
- **Middle**: 4 sub-tabs for style config sections
- **Bottom**: Form controls for the active sub-tab
- **Footer**: "Отмена" (back to Tab 1) + "Сохранить пресет" (create or update)
**Sub-tabs and controls**:
| Sub-tab | Field | Control |
|---------|-------|---------|
| Текст | font_family | Select (Lobster, Inter, Roboto, Montserrat, etc. — include Lobster as it's the backend default) |
| Текст | font_size | Slider (16-96px) |
| Текст | font_weight | Select (400: Обычный / 700: Жирный) — numeric values, backend expects `int` |
| Текст | text_color | Color picker |
| Текст | highlight_color | Color picker |
| Текст | text_shadow | Toggle + text input |
| Текст | text_stroke_width | Number input (0-5px) |
| Текст | text_stroke_color | Color picker |
| Позиция | vertical_position | Select (top / center / bottom) |
| Позиция | horizontal_alignment | Select (left / center / right) |
| Позиция | padding_px | Number input |
| Позиция | max_width_pct | Slider (20-100%) |
| Позиция | lines_per_screen | Number input (1-4) |
| Анимация | highlight_style | Select (color / scale / underline / color_scale) |
| Анимация | highlight_scale | Slider (1.0-2.0) |
| Анимация | segment_transition | Select (fade / slide / none) |
| Анимация | fade_duration_frames | Number input |
| Анимация | animation_speed | Slider (0.5-2.0) |
| Фон | bg_color | Color picker |
| Фон | bg_blur_px | Number input (0-20) |
| Фон | bg_glow_color | Color picker |
| Фон | bg_border_radius_px | Number input (0-24) |
| Фон | bg_padding_px | Number input (0-32) |
**Form management**: `react-hook-form` with nested `CaptionStyleConfig` shape. Form field paths use the nested structure matching the backend schema: `text.font_family`, `text.font_size`, `layout.vertical_position`, `animation.highlight_style`, `background.bg_color`, etc. Preview panel applies form values as inline CSS.
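
As an illustration of the nested shape and the preview mapping, here is a minimal sketch. The grouping into `text`, `layout`, `animation`, and `background` and the field names follow the table above. The TypeScript types, the values in `EXAMPLE_CONFIG` (apart from Lobster as the backend default font), and the choice of CSS properties in the hypothetical `toPreviewStyle` helper are assumptions. The real types should come from the regenerated OpenAPI schema.

```typescript
interface CaptionStyleConfig {
  text: {
    font_family: string;
    font_size: number; // px, 16-96
    font_weight: number; // 400 or 700; backend expects an int
    text_color: string;
    highlight_color: string;
    text_shadow: string | null; // null when the toggle is off (assumed encoding)
    text_stroke_width: number; // px, 0-5
    text_stroke_color: string;
  };
  layout: {
    vertical_position: "top" | "center" | "bottom";
    horizontal_alignment: "left" | "center" | "right";
    padding_px: number;
    max_width_pct: number; // 20-100
    lines_per_screen: number; // 1-4
  };
  animation: {
    highlight_style: "color" | "scale" | "underline" | "color_scale";
    highlight_scale: number; // 1.0-2.0
    segment_transition: "fade" | "slide" | "none";
    fade_duration_frames: number;
    animation_speed: number; // 0.5-2.0
  };
  background: {
    bg_color: string;
    bg_blur_px: number; // 0-20
    bg_glow_color: string;
    bg_border_radius_px: number; // 0-24
    bg_padding_px: number; // 0-32
  };
}

// Illustrative values only; not the backend defaults (except the Lobster font).
const EXAMPLE_CONFIG: CaptionStyleConfig = {
  text: {
    font_family: "Lobster", font_size: 48, font_weight: 700,
    text_color: "#ffffff", highlight_color: "#ffd700", text_shadow: null,
    text_stroke_width: 2, text_stroke_color: "#000000",
  },
  layout: {
    vertical_position: "bottom", horizontal_alignment: "center",
    padding_px: 16, max_width_pct: 80, lines_per_screen: 2,
  },
  animation: {
    highlight_style: "color", highlight_scale: 1.2,
    segment_transition: "fade", fade_duration_frames: 6, animation_speed: 1.0,
  },
  background: {
    bg_color: "#00000080", bg_blur_px: 0, bg_glow_color: "#000000",
    bg_border_radius_px: 8, bg_padding_px: 12,
  },
};

// Maps the static parts of the config to inline CSS for the preview text.
// Animation settings are left out because the preview text is static.
function toPreviewStyle(config: CaptionStyleConfig): Record<string, string | number> {
  const { text, layout, background } = config;
  return {
    fontFamily: text.font_family,
    fontSize: `${text.font_size}px`,
    fontWeight: text.font_weight,
    color: text.text_color,
    textShadow: text.text_shadow ?? "none",
    WebkitTextStroke: `${text.text_stroke_width}px ${text.text_stroke_color}`,
    textAlign: layout.horizontal_alignment,
    maxWidth: `${layout.max_width_pct}%`,
    padding: `${background.bg_padding_px}px`,
    backgroundColor: background.bg_color,
    borderRadius: `${background.bg_border_radius_px}px`,
  };
}
```

With `react-hook-form`, the same nesting yields dotted field paths such as `text.font_size`, so the watched form values can be passed to a helper like this on every change.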
**Save flow**:
- If editing existing preset → `PATCH /api/captions/presets/{id}/` with name + style_config
- If creating new → name input + `POST /api/captions/presets/` with name + style_config
- On success: invalidate presets query, switch back to Tab 1, auto-select the new/updated preset
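
The create-or-update decision in the save flow can be sketched as a function that returns a request descriptor. The `PresetSaveRequest` type and the `buildPresetSaveRequest` helper are hypothetical; the endpoints and the `name` + `style_config` body come from this spec.

```typescript
// Hypothetical descriptor of the request the save button should issue.
interface PresetSaveRequest {
  method: "POST" | "PATCH";
  url: string;
  body: { name: string; style_config: object };
}

// presetId is null when creating a new preset and a UUID when editing an existing one.
function buildPresetSaveRequest(
  presetId: string | null,
  name: string,
  styleConfig: object,
): PresetSaveRequest {
  const body = { name, style_config: styleConfig };
  if (presetId === null) {
    return { method: "POST", url: "/api/captions/presets/", body };
  }
  return { method: "PATCH", url: `/api/captions/presets/${presetId}/`, body };
}
```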
## CaptionResultStep
**Data source**: `captionedVideoPath` from WizardContext → `GET /api/files/get_file/?file_path={path}` to get presigned URL
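
Since an S3 path can contain slashes and spaces, the `file_path` query parameter should be URL-encoded. A minimal sketch, where `buildFileUrl` is a hypothetical helper and the response shape of the endpoint is not assumed:

```typescript
// Builds the request URL for fetching a presigned link to the captioned video.
function buildFileUrl(filePath: string): string {
  return `/api/files/get_file/?file_path=${encodeURIComponent(filePath)}`;
}
```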
**Layout**:
- Full-width video player (Vidstack MediaPlayer) with the captioned video
- Info bar: file name, duration
- Action buttons:
  - "Скачать": triggers browser download of the presigned S3 URL
  - "Перегенерировать": `goToStep("caption-settings")` to re-render with a different preset
  - "Завершить": `markStepCompleted("caption-result")`, wizard done
## ProcessingStep Integration
ProcessingStep already reads `activeJobType` and shows different labels. Add to the `JOB_TYPE_LABELS` map:
```typescript
"CAPTIONS_GENERATE": "ГЕНЕРАЦИЯ СУБТИТРОВ"
```
Auto-advance logic in WizardContext needs a new case:
- When `CAPTIONS_GENERATE` job is DONE → extract captioned video path from job output, store in `captionedVideoPath`, navigate to `caption-result`
## API Hooks (New Files)
### `useSubmitCaptionGenerate.ts`
```typescript
// POST /api/tasks/captions-generate/
// Body: { video_s3_path, folder: "output_files", transcription_id, project_id, preset_id?, style_config? }
// Returns: { job_id, status }
```
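
A minimal sketch of how the hook could assemble this body from WizardContext values. The body keys come from the comment above and the context field names from this spec. The `CaptionGenerateInput` type, the `buildCaptionGenerateBody` helper, the `videoS3Path` and `projectId` inputs, and the use of `string` for all IDs are assumptions.

```typescript
interface CaptionGenerateBody {
  video_s3_path: string;
  folder: "output_files";
  transcription_id: string; // ID type assumed to be string
  project_id: string; // ID type assumed to be string
  preset_id?: string;
  style_config?: object;
}

// Hypothetical input gathered from WizardContext and the current project.
interface CaptionGenerateInput {
  videoS3Path: string;
  projectId: string;
  transcriptionArtifactId: string | null;
  captionPresetId: string | null;
  captionStyleConfig: object | null;
}

function buildCaptionGenerateBody(input: CaptionGenerateInput): CaptionGenerateBody {
  if (input.transcriptionArtifactId === null) {
    // The transcription flow has not set the artifact ID yet.
    throw new Error("transcriptionArtifactId is not set");
  }
  const body: CaptionGenerateBody = {
    video_s3_path: input.videoS3Path,
    folder: "output_files",
    transcription_id: input.transcriptionArtifactId,
    project_id: input.projectId,
  };
  // Optional fields are only sent when present in the context.
  if (input.captionPresetId !== null) body.preset_id = input.captionPresetId;
  if (input.captionStyleConfig !== null) body.style_config = input.captionStyleConfig;
  return body;
}
```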
### `useCaptionPresets.ts`
```typescript
// GET /api/captions/presets/ → list of CaptionPresetRead
// POST /api/captions/presets/ → CaptionPresetCreate → CaptionPresetRead
// PATCH /api/captions/presets/{id}/ → CaptionPresetUpdate → CaptionPresetRead
// DELETE /api/captions/presets/{id}/ → 204
```
## File Structure
```
src/features/project/
├── CaptionSettingsStep/
│ ├── index.ts
│ ├── CaptionSettingsStep.tsx # Main component with tab logic
│ ├── PresetGrid.tsx # Tab 1: preset cards grid
│ ├── StyleEditor.tsx # Tab 2: full style editor
│ ├── StylePreview.tsx # Live preview panel
│ ├── useCaptionPresets.ts # Query + mutations for presets
│ └── useSubmitCaptionGenerate.ts # Caption generation mutation
├── CaptionResultStep/
│ ├── index.ts
│ └── CaptionResultStep.tsx # Video player + download + re-render
```
## Files to Modify
| File | Change |
|------|--------|
| `src/shared/context/WizardContext.tsx` | Add 3 step keys, 3 state fields, auto-advance for CAPTIONS_GENERATE |
| `src/widgets/ProjectWizard/ProjectWizard.tsx` | Add steps to WIZARD_STEPS array and STEP_COMPONENTS map |
| `src/features/project/ProcessingStep/ProcessingStep.tsx` | Add "CAPTIONS_GENERATE" to JOB_TYPE_LABELS |
| `src/features/project/SubtitleRevisionStep/SubtitleRevisionStep.tsx` | Change "Завершить проект" button to "Далее" and add `goToStep("caption-settings")` navigation (currently has no forward navigation, only `markStepCompleted`) |
| `src/shared/api/__generated__/openapi.types.ts` | Regenerate via `bun run gen:api-types` |
## Prerequisites
1. Run `bun run gen:api-types` with backend running to get latest captions preset types
2. Verify backend `/api/captions/presets/` endpoint is accessible
## Error Handling
- **Caption generation fails (FAILED status)**: ProcessingStep already shows failure state with danger-colored progress. User clicks "Назад" (`goBack()`) → navigates back to `caption-settings` to re-submit.
- **Preset delete fails (403)**: Show error toast — system presets cannot be deleted.
- **Preset save fails (validation)**: Display field-level errors from API response.
- **Result video URL expired**: Re-fetch presigned URL on player error via retry.
## Verification
1. Navigate to an existing project that has completed subtitle-revision
2. After subtitle-revision, wizard should advance to "Настройка субтитров"
3. Verify system presets (Классические, Неон, Минимализм) appear in the grid
4. Create a custom preset via the style editor, verify it appears in grid
5. Edit and delete the custom preset, verify CRUD works
6. Select a preset and click "Генерировать" → verify navigation to processing step
7. Wait for job completion → verify navigation to result step
8. Verify captioned video plays in the result step
9. Click "Перегенерировать" → verify return to caption-settings
10. Click "Скачать" → verify download works