chore: claude final touches
This commit is contained in:
@@ -0,0 +1,393 @@
|
||||
# CODEX Refactor Plan
|
||||
|
||||
## Chosen direction
|
||||
|
||||
This plan adopts **Option 1: in-place cleanup inside current boundaries**.
|
||||
|
||||
The goal is to make backend feature work safer and the code easier to read **without moving far from the current implementation**.
|
||||
|
||||
This means:
|
||||
|
||||
- keep the existing module structure (`models.py`, `schemas.py`, `repository.py`, `service.py`, `router.py`)
|
||||
- keep public HTTP routes and response contracts stable
|
||||
- avoid hidden abstractions, base classes, registries, or plugin-style indirection
|
||||
- prefer explicit, local helpers over framework-like refactors
|
||||
- optimize first for **developer readability and safer future changes**
|
||||
|
||||
## Why this option
|
||||
|
||||
The biggest current readability and change-risk hotspot is `cpv3/modules/tasks/service.py`.
|
||||
|
||||
Right now it mixes several responsibilities in one place:
|
||||
|
||||
- broker/bootstrap concerns
|
||||
- Dramatiq actor definitions
|
||||
- task submission flow
|
||||
- job lifecycle side effects
|
||||
- webhook application
|
||||
- artifact persistence
|
||||
- notification sending
|
||||
- cancellation side effects
|
||||
|
||||
The highest concentration of risk is in:
|
||||
|
||||
- `cpv3/modules/tasks/service.py` — especially `TaskService` and `record_webhook_event()`
|
||||
- `cpv3/modules/jobs/router.py` — cancellation path delegates into task orchestration
|
||||
- `cpv3/modules/captions/service.py` — already contains both preset CRUD and Remotion client logic
|
||||
|
||||
A bigger ownership move would be possible later, but it is not the safest first step.
|
||||
|
||||
## Primary objectives
|
||||
|
||||
1. Make task/job orchestration easier to understand.
|
||||
2. Reduce the blast radius of changes in `TaskService`.
|
||||
3. Replace raw internal JSON/dict handling with typed, explicit internal contracts.
|
||||
4. Keep behavior stable for current API consumers and existing module boundaries.
|
||||
5. Improve confidence with focused unit and integration coverage before and during refactoring.
|
||||
|
||||
## Constraints
|
||||
|
||||
### Must keep
|
||||
|
||||
- Current backend module structure.
|
||||
- Current route layout and public API behavior.
|
||||
- Current job types and overall task flow.
|
||||
- Current ownership model at module level.
|
||||
- Explicit code over clever abstractions.
|
||||
|
||||
### Allowed
|
||||
|
||||
- Small internal contract cleanup.
|
||||
- Small private helper extraction inside existing files.
|
||||
- Better naming, smaller methods, and clearer sequencing.
|
||||
- Add or expand tests.
|
||||
|
||||
### Not allowed in this phase
|
||||
|
||||
- New module subdirectories or extra architecture layers.
|
||||
- Generic handler registries.
|
||||
- Base service hierarchies.
|
||||
- Large movement of lifecycle ownership from `tasks` into `jobs`.
|
||||
- Pushing more orchestration into `captions`, `media`, or `transcription` before task contracts are cleaned up.
|
||||
- Broad request/response or frontend-facing contract changes.
|
||||
|
||||
## Refactor scope
|
||||
|
||||
### In scope
|
||||
|
||||
- `cpv3/modules/tasks/service.py`
|
||||
- `cpv3/modules/tasks/schemas.py`
|
||||
- `cpv3/modules/tasks/router.py` where needed for clarity only
|
||||
- `cpv3/modules/jobs/router.py` where needed for clearer delegation only
|
||||
- tests around task submission, webhook handling, cancellation, and caption task behavior
|
||||
|
||||
### Nearby modules to touch carefully if needed
|
||||
|
||||
- `cpv3/modules/captions/service.py`
|
||||
- `cpv3/modules/media/service.py`
|
||||
- `cpv3/modules/transcription/repository.py`
|
||||
- `cpv3/modules/jobs/service.py`
|
||||
|
||||
These are supporting touches only. The refactor should remain centered in `tasks`.
|
||||
|
||||
## Target end state
|
||||
|
||||
After this phase:
|
||||
|
||||
- `TaskService` still owns task orchestration,
|
||||
- but each major step is explicit and small,
|
||||
- stored job payloads are validated through typed internal schemas,
|
||||
- webhook application is broken into readable stages,
|
||||
- repeated submission logic is reduced through simple private helpers,
|
||||
- actor flow is easier to scan,
|
||||
- tests cover the critical seams that future feature work will depend on.
|
||||
|
||||
`tasks/service.py` may still be large after this phase, but it should stop being a "many responsibilities blurred together" file.
|
||||
|
||||
## Main decisions
|
||||
|
||||
### 1. Keep orchestration in `tasks` for now
|
||||
|
||||
Do **not** move job lifecycle ownership into `jobs/service.py` in this phase.
|
||||
|
||||
Reason:
|
||||
|
||||
- it is a valid long-term direction,
|
||||
- but it changes semantic ownership of cancellation and webhook application,
|
||||
- which raises regression risk before the current flow is properly typed and decomposed.
|
||||
|
||||
### 2. Add typed internal payload schemas
|
||||
|
||||
Add explicit internal schemas in `cpv3/modules/tasks/schemas.py` for per-job stored payloads.
|
||||
|
||||
These schemas are for internal readability and validation only.
|
||||
|
||||
They should:
|
||||
|
||||
- represent the current JSON shape stored in `job.input_data` / `job.output_data`
|
||||
- remain compatible with existing persisted data
|
||||
- avoid forcing a database migration in this phase
|
||||
- be explicit per job type rather than generic
|
||||
|
||||
Examples of the kind of models expected:
|
||||
|
||||
- media probe input/output
|
||||
- silence detect input/output
|
||||
- silence apply input/output
|
||||
- media convert input/output
|
||||
- transcription generate input/output
|
||||
- frame extract input/output
|
||||
- captions generate input/output
|
||||
|
||||
Important: the database can still store JSON, but `TaskService` should stop passing raw dicts around when internal meaning matters.
|
||||
|
||||
### 3. Split `record_webhook_event()` into explicit stages
|
||||
|
||||
Refactor `record_webhook_event()` into small private methods with one purpose each.
|
||||
|
||||
Target shape:
|
||||
|
||||
- load and validate job state
|
||||
- ignore terminal jobs when appropriate
|
||||
- apply status update
|
||||
- append job event row
|
||||
- run job-type-specific done hooks
|
||||
- send notification
|
||||
- return updated job
|
||||
|
||||
This keeps behavior in one place while making the sequencing obvious.
|
||||
|
||||
### 4. Reduce duplication only where it is obvious
|
||||
|
||||
Allowed helper extraction:
|
||||
|
||||
- output folder resolution
|
||||
- broker reference update
|
||||
- source file lookup/create flow
|
||||
- artifact persistence helpers
|
||||
- notification dispatch helper
|
||||
- common actor start/fail envelope if the resulting code is still explicit
|
||||
|
||||
Do **not** introduce a generic task framework.
|
||||
|
||||
### 5. Keep domain persistence mostly where it is today
|
||||
|
||||
Do not push transcription/media/captions result persistence back into their modules yet.
|
||||
|
||||
Reason:
|
||||
|
||||
- domain ownership may improve long-term,
|
||||
- but doing it before task contracts are typed risks scattering orchestration across several large files.
|
||||
|
||||
This phase should first make the existing orchestration understandable.
|
||||
|
||||
## Execution phases
|
||||
|
||||
### Phase 0 — Lock behavior with tests
|
||||
|
||||
Before changing the flow, expand tests around the current seams.
|
||||
|
||||
Priority coverage:
|
||||
|
||||
- `tests/unit/test_task_service.py`
|
||||
- duplicate active job reuse
|
||||
- terminal job webhook ignore behavior
|
||||
- webhook event update sequencing
|
||||
- artifact persistence trigger points
|
||||
- cancellation path behavior
|
||||
- `tests/unit/test_caption_tasks.py`
|
||||
- caption render fallback behavior
|
||||
- callback confirmation behavior
|
||||
- cancellation-related safety where relevant
|
||||
- integration coverage where current behavior is externally visible
|
||||
- `tests/integration/test_jobs_endpoints.py`
|
||||
- `tests/integration/test_captions_endpoints.py` if affected
|
||||
|
||||
The goal is not broad test expansion. The goal is to pin the behavior of the exact code being refactored.
|
||||
|
||||
### Phase 1 — Introduce typed internal task payloads
|
||||
|
||||
Add internal schemas in `cpv3/modules/tasks/schemas.py` for job payloads.
|
||||
|
||||
Implementation rules:
|
||||
|
||||
- keep schema names explicit per job type
|
||||
- support current field names
|
||||
- make optional fields tolerant enough for existing rows where necessary
|
||||
- use these schemas at service boundaries where `input_data` and `output_data` are read or produced
|
||||
|
||||
Expected outcome:
|
||||
|
||||
- internal payload meaning becomes visible from types instead of being inferred from raw dict access
|
||||
- future changes become easier to review safely
|
||||
|
||||
### Phase 2 — Decompose webhook application flow
|
||||
|
||||
Refactor `TaskService.record_webhook_event()` into a sequence of small private methods.
|
||||
|
||||
Recommended decomposition:
|
||||
|
||||
- `_load_job_for_webhook(...)`
|
||||
- `_should_ignore_webhook(...)`
|
||||
- `_apply_job_status_update(...)`
|
||||
- `_create_job_event(...)`
|
||||
- `_run_done_side_effects(...)`
|
||||
- `_send_job_notification(...)`
|
||||
|
||||
Method names may differ, but the responsibilities should remain this explicit.
|
||||
|
||||
Important:
|
||||
|
||||
- keep sequencing behavior unchanged
|
||||
- preserve the current order where artifacts are saved before notifications when that order matters
|
||||
|
||||
### Phase 3 — Normalize submission helpers
|
||||
|
||||
Refactor repetitive submit methods without hiding behavior.
|
||||
|
||||
The submit methods should remain explicit, but repeated steps can be centralized where the code becomes clearer:
|
||||
|
||||
- resolve user output folder
|
||||
- prepare actor kwargs
|
||||
- create job + webhook
|
||||
- enqueue actor
|
||||
- persist broker reference
|
||||
- build submit response
|
||||
|
||||
The final submit methods should still read as business flows, not as thin wrappers over a generic registry.
|
||||
|
||||
### Phase 4 — Clean actor envelope duplication
|
||||
|
||||
If still useful after earlier phases, standardize only the repeated envelope around actors:
|
||||
|
||||
- cancellation check
|
||||
- running event emission
|
||||
- success/failure reporting
|
||||
|
||||
This should stay simple and local.
|
||||
|
||||
If the helper makes actor behavior harder to follow, do not extract it.
|
||||
|
||||
### Phase 5 — Reassess, do not auto-expand scope
|
||||
|
||||
After the in-place cleanup is done, reassess whether a second phase is justified.
|
||||
|
||||
Only then consider:
|
||||
|
||||
- moving lifecycle semantics into `jobs`
|
||||
- pushing result persistence into owning modules
|
||||
|
||||
Those are explicitly **not** part of the current plan.
|
||||
|
||||
## File-by-file plan
|
||||
|
||||
### `cpv3/modules/tasks/schemas.py`
|
||||
|
||||
- Add explicit internal payload schemas for job input/output.
|
||||
- Keep public API request/response schemas stable.
|
||||
- Prefer additive internal typing over schema churn.
|
||||
|
||||
### `cpv3/modules/tasks/service.py`
|
||||
|
||||
- Keep this file as the orchestration center for this phase.
|
||||
- Decompose large methods into readable private steps.
|
||||
- Replace raw internal dict usage with typed parsing/serialization where the meaning matters.
|
||||
- Reduce repeated code only through local, explicit helpers.
|
||||
|
||||
### `cpv3/modules/tasks/router.py`
|
||||
|
||||
- Keep routes unchanged.
|
||||
- Only adjust delegation or error handling if required to support clearer internal service boundaries.
|
||||
|
||||
### `cpv3/modules/jobs/router.py`
|
||||
|
||||
- Keep the current cancel entrypoint behavior.
|
||||
- If touched, only make delegation clearer.
|
||||
- Do not move cancel semantics into `jobs/service.py` in this phase.
|
||||
|
||||
### `cpv3/modules/captions/service.py`
|
||||
|
||||
- Keep current ownership.
|
||||
- Do not add more orchestration here in phase 1.
|
||||
- Only make supporting adjustments if typed payload work needs a clearer boundary.
|
||||
|
||||
### `cpv3/modules/media/service.py`
|
||||
|
||||
- Keep media processing helpers as-is unless task cleanup reveals a very small, obvious extraction need.
|
||||
- Avoid expanding this file into a second orchestration center.
|
||||
|
||||
## Risk controls
|
||||
|
||||
### Preserve persisted JSON compatibility
|
||||
|
||||
Existing jobs may already contain payloads in the database.
|
||||
|
||||
Therefore:
|
||||
|
||||
- typed internal schemas must support current persisted shapes
|
||||
- avoid renaming stored keys unless there is a very strong reason
|
||||
- prefer backward-compatible parsing over cleanup for cleanup’s sake
|
||||
|
||||
### Preserve side-effect ordering
|
||||
|
||||
Current flow saves some artifacts before notification dispatch.
|
||||
|
||||
That ordering matters because clients may refetch immediately after notification.
|
||||
|
||||
Do not change this ordering unless there is a proven reason and test coverage for the new behavior.
|
||||
|
||||
### Avoid ownership churn during cleanup
|
||||
|
||||
This phase is about readability and safer edits, not about redistributing the architecture.
|
||||
|
||||
If a change starts to feel like "this really belongs in another module," mark it for reassessment after phase 1 instead of expanding scope immediately.
|
||||
|
||||
## Verification strategy
|
||||
|
||||
For implementation of this plan, verification should include:
|
||||
|
||||
- targeted unit tests for `TaskService`
|
||||
- targeted unit tests for caption task actor behavior
|
||||
- relevant integration tests for jobs/tasks endpoints
|
||||
- Ruff checks for touched files
|
||||
|
||||
The most important verification is behavioral:
|
||||
|
||||
- task submission still works
|
||||
- duplicate job reuse still works
|
||||
- webhook updates still drive correct job state
|
||||
- artifact persistence still happens before notification where required
|
||||
- cancellation still behaves as before
|
||||
|
||||
## Definition of done for this phase
|
||||
|
||||
Phase 1 is complete when all of the following are true:
|
||||
|
||||
- internal job payload handling is typed where it matters
|
||||
- `record_webhook_event()` is decomposed into explicit, readable steps
|
||||
- repetitive submit flow is reduced without introducing hidden abstractions
|
||||
- tests protect the main task orchestration seams
|
||||
- public APIs remain effectively unchanged
|
||||
- the backend is easier to change safely without rediscovering task flow each time
|
||||
|
||||
## Explicit non-goals
|
||||
|
||||
This plan does **not** include:
|
||||
|
||||
- moving lifecycle ownership into `jobs/service.py`
|
||||
- redesigning the task system
|
||||
- introducing a generic abstraction layer for tasks
|
||||
- moving artifact persistence into every owning domain module
|
||||
- broad route redesign
|
||||
- frontend contract changes
|
||||
- module structure changes
|
||||
|
||||
## Follow-up after this phase
|
||||
|
||||
Once this plan is implemented and stable, reassess whether there is enough remaining pressure to justify a second phase.
|
||||
|
||||
If yes, the next candidate should be:
|
||||
|
||||
- **Option 2** selectively, and only where phase 1 proves that lifecycle ownership is still the main source of friction.
|
||||
|
||||
Until then, keep the refactor intentionally conservative.
|
||||
Reference in New Issue
Block a user