rev 4
@@ -1,600 +1,15 @@
|
||||
# AGENTS.md - AI Coding Guidelines for CofeeProject Backend
|
||||
# AGENTS.md — Coffee Project Backend
|
||||
|
||||
This document provides guidelines and best practices for AI agents working with this codebase.
|
||||
Primary workflow guidance lives in `../AGENTS.md`.
|
||||
|
||||
---
|
||||
Use `./CLAUDE.md` as the service-specific source of truth for:
|
||||
|
||||
## Core Principles
|
||||
- backend commands
|
||||
- module architecture
|
||||
- backend patterns and gotchas
|
||||
|
||||
### 1. Code Should Be Simple, Readable, and Well Supported
|
||||
OpenCode/Codex notes:
|
||||
- Write code that humans can understand at first glance
- Prefer explicit over implicit behavior
- Use clear control flow patterns (avoid deeply nested conditions)
- Add docstrings for public functions, classes, and modules
- Keep functions short and focused (ideally under 30 lines)

### 2. Less Overhead Is Better

- Avoid unnecessary abstractions and over-engineering
- Don't add layers of indirection without clear benefit
- Prefer direct solutions over clever ones
- Minimize dependencies where possible
- Use built-in Python features before reaching for external libraries

### 3. No Magic Values

- Define constants with meaningful names at module level
- Use enums or `Literal` types for fixed sets of values (see `ArtifactTypeEnum` pattern)
- Configuration values belong in the `Settings` class with explicit defaults
- Never hardcode timeouts, limits, or thresholds inline
- Store user-facing error messages as module-level constants with `ERROR_` prefix
  - Example: `ERROR_NO_AUDIO_STREAM = "Файл не содержит аудиодорожки"` (English: "The file contains no audio track")

```python
# BAD
if silence_db > 16:
    ...

# GOOD
SILENCE_THRESHOLD_DB = 16

if silence_db > SILENCE_THRESHOLD_DB:
    ...
```
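
The enum, `Literal`, and `ERROR_` bullets above can be sketched together as follows. This is a minimal illustration, not the project's code: the members of `ArtifactTypeEnum`, the `JobStatus` alias, and both helper functions are hypothetical.

```python
from enum import Enum
from typing import Literal, get_args


# Hypothetical members; the real `ArtifactTypeEnum` in the codebase may differ.
class ArtifactTypeEnum(str, Enum):
    TRANSCRIPT = "transcript"
    WAVEFORM = "waveform"


# `Literal` alternative for a small fixed set of string values (assumed name).
JobStatus = Literal["queued", "running", "done", "failed"]

# User-facing messages live in module-level constants with the ERROR_ prefix.
ERROR_UNKNOWN_ARTIFACT_TYPE = "Unknown artifact type"


def parse_artifact_type(raw_value: str) -> ArtifactTypeEnum:
    """Convert a raw string into an enum member or fail with a named message."""
    try:
        return ArtifactTypeEnum(raw_value)
    except ValueError as error:
        raise ValueError(ERROR_UNKNOWN_ARTIFACT_TYPE) from error


def is_known_job_status(raw_value: str) -> bool:
    """Check a raw string against the values allowed by the Literal type."""
    return raw_value in get_args(JobStatus)
```

Keeping the allowed values in one place means a new artifact type or status is added by editing a single definition rather than hunting for string comparisons.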

### 4. One Function Should Implement One Purpose

- Each function should do exactly one thing
- If a function needs "and" in its description, split it
- Extract helper functions for distinct subtasks
- Keep side effects isolated and predictable

```python
# BAD
async def get_and_validate_and_process_media(file_key: str) -> MediaResult:
    ...

# GOOD
async def download_media(file_key: str) -> TempFile:
    ...

def validate_media_format(file_path: str) -> bool:
    ...

async def process_media(file_path: str) -> MediaResult:
    ...
```

### 5. All Variable Names Should Have Meaning Based on Context

- Use descriptive names that explain purpose, not type
- Avoid single-letter variables (except for trivial loops)
- Prefix boolean variables with `is_`, `has_`, `can_`, `should_`
- Use domain terminology consistently

```python
# BAD
x = await repo.get(id)
flag = x.is_deleted

# GOOD
media_file = await media_repository.get_by_id(media_file_id)
is_soft_deleted = media_file.is_deleted
```

---

## Project Architecture

### Layer Structure

```
cpv3/
├── api/v1/             # API version routing
├── common/             # Shared schemas and utilities
├── db/                 # Database base classes and session
├── infrastructure/     # Cross-cutting concerns (auth, storage, settings)
└── modules/            # Feature modules (domain logic)
    └── <module>/
        ├── models.py       # SQLAlchemy models
        ├── schemas.py      # Pydantic DTOs
        ├── repository.py   # Database access layer
        ├── service.py      # Business logic
        └── router.py       # FastAPI endpoints
```

### Module Responsibilities

| Layer           | Responsibility                             | Dependencies                  |
| --------------- | ------------------------------------------ | ----------------------------- |
| `router.py`     | HTTP request/response handling, validation | schemas, service, repository  |
| `service.py`    | Business logic, orchestration              | repository, external services |
| `repository.py` | Database queries, CRUD operations          | models, session               |
| `schemas.py`    | Data transfer objects, validation          | pydantic                      |
| `models.py`     | Database table definitions                 | SQLAlchemy                    |

---

## Coding Standards

### Python Version & Style

- **Python 3.11+** required
- Use `from __future__ import annotations` for forward references
- Line length: **100 characters** (configured in ruff)
- Use type hints for all function signatures
- Async-first approach for I/O operations

### Imports

```python
# Standard library
from __future__ import annotations
import uuid
from datetime import datetime, timezone
from typing import Literal

# Third-party
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.ext.asyncio import AsyncSession
from pydantic import BaseModel, Field

# Local imports (absolute paths)
from cpv3.infrastructure.auth import get_current_user
from cpv3.modules.media.schemas import MediaFileRead
from cpv3.modules.media.repository import MediaFileRepository
```

### Pydantic Schemas

- Inherit from `cpv3.common.schemas.Schema` for consistent config
- Use `Literal` types for enums with string values
- Suffix schema names: `*Create`, `*Update`, `*Read`

```python
from datetime import datetime
from uuid import UUID

from cpv3.common.schemas import Schema

class MediaFileRead(Schema):
    id: UUID
    owner_id: UUID
    duration_seconds: float
    is_deleted: bool
    created_at: datetime
```

### SQLAlchemy Models

- Inherit from `Base` and `BaseModelMixin`
- Use explicit column types
- Add indexes for frequently queried fields
- Use soft deletes (`is_deleted` flag)

```python
import uuid

from sqlalchemy import Boolean, ForeignKey
# Assumed import: the PostgreSQL dialect type; `sqlalchemy.UUID` also accepts `as_uuid`
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm import Mapped, mapped_column

from cpv3.db.base import Base, BaseModelMixin

class MediaFile(Base, BaseModelMixin):
    __tablename__ = "media_files"

    owner_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), ForeignKey("users.id", ondelete="RESTRICT"), index=True
    )
    is_deleted: Mapped[bool] = mapped_column(Boolean, default=False)
```

### Repository Pattern

- One repository per model
- Accept `AsyncSession` in constructor
- Methods should be atomic and focused
- Filter soft-deleted records by default

```python
import uuid

from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

class MediaFileRepository:
    def __init__(self, session: AsyncSession) -> None:
        self._session = session

    async def get_by_id(self, media_file_id: uuid.UUID) -> MediaFile | None:
        result = await self._session.execute(
            select(MediaFile).where(MediaFile.id == media_file_id)
        )
        media_file = result.scalar_one_or_none()
        # Soft-deleted rows are treated as missing
        if media_file is None or media_file.is_deleted:
            return None
        return media_file
```

### FastAPI Endpoints

- Use dependency injection for DB session, auth, and services
- Return typed response models
- Use appropriate HTTP status codes
- Handle errors with `HTTPException`

```python
@router.get("/mediafiles/{media_file_id}", response_model=MediaFileRead)
async def get_mediafile(
    media_file_id: uuid.UUID,
    current_user: User = Depends(get_current_user),
    db: AsyncSession = Depends(get_db),
) -> MediaFileRead:
    repo = MediaFileRepository(db)
    media_file = await repo.get_by_id(media_file_id)
    if media_file is None:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Not found")
    return MediaFileRead.model_validate(media_file)
```

---

## Configuration & Settings

### Environment Variables

- All configuration goes through the `Settings` class in `infrastructure/settings.py`
- Use `Field(default=..., alias="ENV_VAR_NAME")` pattern
- Provide sensible defaults for local development
- Never commit secrets to the repository

```python
from pydantic import Field
# With Pydantic 2.x, BaseSettings is provided by the pydantic-settings package
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    jwt_secret_key: str = Field(default="dev-secret", alias="JWT_SECRET_KEY")
    jwt_algorithm: str = Field(default="HS256", alias="JWT_ALGORITHM")
    jwt_access_ttl_minutes: int = Field(default=60, alias="JWT_ACCESS_TTL_MINUTES")
```

### Accessing Settings

```python
from cpv3.infrastructure.settings import get_settings

settings = get_settings()  # Cached via @lru_cache
```
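
How a cached accessor of this kind behaves can be sketched with the standard library alone. The dataclass below is a stand-in for the pydantic-based `Settings` class and is not the project's implementation; only the environment variable names come from the example above. One consequence worth knowing: because the result is cached, code that changes environment variables afterwards (tests, for instance) would need `get_settings.cache_clear()` before the change becomes visible.

```python
import os
from dataclasses import dataclass
from functools import lru_cache

DEFAULT_JWT_ALGORITHM = "HS256"
DEFAULT_JWT_ACCESS_TTL_MINUTES = 60


@dataclass(frozen=True)
class Settings:
    """Stdlib stand-in for the project's pydantic-based Settings class."""

    jwt_algorithm: str
    jwt_access_ttl_minutes: int


@lru_cache
def get_settings() -> Settings:
    """Read the environment once; later calls return the same cached object."""
    return Settings(
        jwt_algorithm=os.environ.get("JWT_ALGORITHM", DEFAULT_JWT_ALGORITHM),
        jwt_access_ttl_minutes=int(
            os.environ.get("JWT_ACCESS_TTL_MINUTES", str(DEFAULT_JWT_ACCESS_TTL_MINUTES))
        ),
    )
```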

---

## Testing Guidelines

### Test Structure

```
tests/
├── conftest.py         # Shared fixtures
├── unit/               # Unit tests (isolated)
└── integration/        # Integration tests (with DB/services)
```

### Fixtures

- Use `pytest-asyncio` for async tests
- Create isolated database sessions per test
- Mock external services (storage, APIs)

```python
@pytest.fixture
async def test_user(test_db_session: AsyncSession) -> User:
    user = User(
        id=uuid.uuid4(),
        username="testuser",
        email="test@example.com",
        password_hash=hash_password("testpassword"),
        is_active=True,
    )
    test_db_session.add(user)
    await test_db_session.commit()
    return user
```
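
Mocking an external service, as the last bullet above suggests, might look like the sketch below. The `build_download_url` function and the `exists`/`presign` methods on the storage client are hypothetical, invented for illustration. The test drives the coroutine with `asyncio.run` so the example stays stdlib-only; under `pytest-asyncio` the test would more likely be an `async def`.

```python
import asyncio
from unittest.mock import AsyncMock


async def build_download_url(storage_client, file_key: str) -> str:
    """Hypothetical service function that depends on an external storage client."""
    is_present = await storage_client.exists(file_key)
    if not is_present:
        raise FileNotFoundError(file_key)
    return await storage_client.presign(file_key)


def test_build_download_url_when_file_exists_returns_presigned_url() -> None:
    # AsyncMock makes every attribute an awaitable mock, so no network is touched.
    storage_client = AsyncMock()
    storage_client.exists.return_value = True
    storage_client.presign.return_value = "https://storage.example/media.mp4"

    url = asyncio.run(build_download_url(storage_client, "media.mp4"))

    assert url == "https://storage.example/media.mp4"
    storage_client.presign.assert_awaited_once_with("media.mp4")
```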

### Test Naming

```python
# Pattern: test_<action>_<condition>_<expected_result>
async def test_get_mediafile_when_not_found_returns_404():
    ...

async def test_create_mediafile_with_valid_data_returns_201():
    ...
```

---

## Common Patterns

### Error Handling

```python
# Use specific HTTP exceptions
raise HTTPException(
    status_code=status.HTTP_404_NOT_FOUND,
    detail="Media file not found"
)

# Re-raise with context
try:
    result = await external_service.call()
except ExternalError as e:
    raise HTTPException(
        status_code=status.HTTP_502_BAD_GATEWAY,
        detail="External service unavailable"
    ) from e
```
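
What "re-raise with context" buys can be shown with plain exceptions. In this sketch both exception classes and both functions are hypothetical stand-ins (no FastAPI involved). The point is only that `raise ... from error` keeps the original failure reachable through `__cause__`, so logs and tracebacks show both errors.

```python
class ExternalError(Exception):
    """Stand-in for an error raised by a third-party client."""


class ServiceUnavailableError(Exception):
    """Stand-in for the HTTP-level error returned to the caller."""


def call_external_service() -> str:
    """Simulate a failing dependency."""
    raise ExternalError("connection reset by peer")


def fetch_transcript() -> str:
    """Translate the low-level failure into a caller-facing one."""
    try:
        return call_external_service()
    except ExternalError as error:
        # `from error` stores the original exception on `__cause__`.
        raise ServiceUnavailableError("External service unavailable") from error
```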

### Async Operations

```python
import asyncio

# For CPU-bound work in async context
import anyio

result = await anyio.to_thread.run_sync(cpu_intensive_function, arg1, arg2)

# For subprocess calls
proc = await asyncio.create_subprocess_exec(
    "ffprobe", "-v", "error", file_path,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE,
)
stdout, stderr = await proc.communicate()
```

### Temporary Files

```python
from pathlib import Path
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
    tmp_path = tmp.name
try:
    # Use tmp_path
    ...
finally:
    # Clean up
    Path(tmp_path).unlink(missing_ok=True)
```
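
If the create/try/finally sequence above is needed in several places, one option is to wrap it in a context manager. The `temporary_media_path` helper below is a hypothetical sketch, not an existing project utility; following the placement rules elsewhere in this document, it would live in an existing standard file rather than a new one.

```python
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path
from tempfile import NamedTemporaryFile


@contextmanager
def temporary_media_path(suffix: str) -> Iterator[Path]:
    """Yield a path to an empty temp file and delete it on exit, even on error."""
    # delete=False keeps the file after the handle closes, so other tools can open it.
    with NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp_path = Path(tmp.name)
    try:
        yield tmp_path
    finally:
        tmp_path.unlink(missing_ok=True)
```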

---

## Do's and Don'ts

### ✅ DO

- Use type hints everywhere
- Write async code for I/O operations
- Use dependency injection
- Keep modules self-contained
- Write tests for new features
- Use meaningful commit messages
- Follow existing patterns in the codebase

### ❌ DON'T

- Use global mutable state
- Put business logic in routers
- Hardcode configuration values
- Ignore type checker warnings
- Write overly clever code
- Skip error handling
- Mix sync and async DB operations

---

## Quick Reference

| Task                  | Location                              |
| --------------------- | ------------------------------------- |
| Add new endpoint      | `modules/<module>/router.py`          |
| Add database model    | `modules/<module>/models.py`          |
| Add validation schema | `modules/<module>/schemas.py`         |
| Add business logic    | `modules/<module>/service.py`         |
| Add database query    | `modules/<module>/repository.py`      |
| Add configuration     | `infrastructure/settings.py`          |
| Add shared utility    | `common/`                             |
| Add migration         | Run `alembic revision --autogenerate` |

---

## Package Management

This project uses **[uv](https://docs.astral.sh/uv/)** as the package manager - a fast Python package installer and resolver written in Rust.

### Common Commands

```bash
# Install all dependencies
uv sync

# Add a new dependency
uv add <package-name>

# Add a dev dependency
uv add --group dev <package-name>

# Run a command in the virtual environment
uv run <command>

# Run the development server
uv run uvicorn cpv3.main:app --reload

# Run tests
uv run pytest
```

### Why uv?

- **Speed** - 10-100x faster than pip
- **Reliable** - Deterministic dependency resolution
- **Compatible** - Works with standard `pyproject.toml`

---

## Dependencies

Key dependencies used in this project:

- **FastAPI** - Web framework
- **SQLAlchemy 2.0** - ORM (async mode)
- **Pydantic 2.x** - Data validation
- **asyncpg** - PostgreSQL async driver
- **Alembic** - Database migrations
- **pytest-asyncio** - Async testing
- **boto3** - AWS S3 storage
- **pydub** - Audio processing
- **openai-whisper** - Transcription
- **Dramatiq** - Background task queue (with Redis broker)

---

## Common AI Agent Mistakes to Avoid

This section documents real errors made during AI-assisted development sessions. Learn from these mistakes.

### 1. Over-Engineering and Breaking Module Structure

**What happened:** When asked to implement background tasks, the agent created excessive files:

```
# BAD - What was created
cpv3/modules/tasks/
├── __init__.py
├── actors.py           # ❌ Non-standard
├── base.py             # ❌ Non-standard
├── db_helpers.py       # ❌ Non-standard
├── webhook_dispatch.py # ❌ Non-standard
├── handlers/           # ❌ Non-standard directory
│   ├── __init__.py
│   ├── base.py
│   ├── media_probe.py
│   ├── silence_remove.py
│   └── ...
├── schemas.py
├── service.py
└── router.py

# GOOD - Standard module structure
cpv3/modules/tasks/
├── __init__.py
├── schemas.py          # DTOs only
├── service.py          # All business logic including actors
└── router.py           # Endpoints only
```

**Why it's wrong:**

- Ignored existing module patterns in the codebase
- Added unnecessary abstraction layers (BaseTaskHandler, registry pattern)
- Created cognitive overhead for maintainers

**Advice:**

- **ALWAYS examine existing modules first** before creating new ones
- **Match the existing file naming conventions exactly**
- Standard module files: `__init__.py`, `models.py`, `schemas.py`, `repository.py`, `service.py`, `router.py`
- Only create files from this list; consolidate everything else into `service.py`

---

### 2. Misinterpreting "Make It Flexible" or "Apply SRP"

**What happened:** When asked to "make tasks module more flexible with SRP compliance", the agent interpreted this as creating:

- Abstract base classes (`BaseTaskHandler`, `BaseTaskSubmitter`)
- A registry pattern with dynamic handler registration
- Separate files for each handler implementation
- Complex inheritance hierarchies

**Why it's wrong:**

- SRP doesn't mean "one class per file" or "maximum abstraction"
- Flexibility doesn't mean "prepare for every possible future change"
- This violates the project's core principle: **"Less Overhead Is Better"**

**Advice:**

- SRP = one function does one thing, NOT one file per concept
- "Flexible" = easy to modify, NOT infinitely extensible
- When in doubt, keep it in one file and refactor later if needed
- Abstract base classes are rarely needed; prefer composition

```python
# BAD - Over-abstracted
class BaseTaskHandler(ABC):
    @abstractmethod
    async def validate(self, request): ...
    @abstractmethod
    async def execute(self, job_id): ...
    @abstractmethod
    async def on_complete(self, result): ...

class MediaProbeHandler(BaseTaskHandler):
    ...

# GOOD - Simple and direct
@dramatiq.actor
def media_probe_actor(job_id: str, media_file_id: str) -> None:
    """Probe media file for metadata."""
    # All logic here, no inheritance needed
    ...
```

---

### 3. Not Reading AGENTS.md Before Starting

**What happened:** The agent proceeded with implementation without fully considering the documented principles, particularly:

- "Avoid unnecessary abstractions and over-engineering"
- "Don't add layers of indirection without clear benefit"
- "Prefer direct solutions over clever ones"

**Advice:**

- **Read AGENTS.md completely before any implementation**
- Re-read relevant sections when making architectural decisions
- When the user's request conflicts with AGENTS.md principles, ask for clarification

---

### 4. Creating Files Without Checking Existing Patterns

**What happened:** The agent created a `handlers/` subdirectory and multiple utility files without checking how other modules handle similar needs.

**Advice:**

- Before creating ANY new file, run: `ls cpv3/modules/<similar_module>/`
- Check if the functionality can fit into existing standard files
- If you need a helper function, put it in `service.py`, not a new file
- Subdirectories within modules are almost never appropriate
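
The manual `ls` check above could also be automated. The sketch below is a hypothetical helper, not an existing project script: it lists entries in a module directory that fall outside the standard file set named in this document. Ignoring `__pycache__` is an added assumption.

```python
from pathlib import Path

# The standard module files listed in this document.
STANDARD_MODULE_FILES = frozenset(
    {"__init__.py", "models.py", "schemas.py", "repository.py", "service.py", "router.py"}
)
# Assumption: interpreter cache directories are not counted as violations.
IGNORED_ENTRIES = frozenset({"__pycache__"})


def find_nonstandard_entries(module_dir: Path) -> list[str]:
    """Return names in a module directory that are not standard module files."""
    return sorted(
        entry.name
        for entry in module_dir.iterdir()
        if entry.name not in STANDARD_MODULE_FILES and entry.name not in IGNORED_ENTRIES
    )
```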

---

### 5. Ignoring the "Quick Reference" Table

The AGENTS.md contains a clear reference:

| Task                  | Location                         |
| --------------------- | -------------------------------- |
| Add new endpoint      | `modules/<module>/router.py`     |
| Add database model    | `modules/<module>/models.py`     |
| Add validation schema | `modules/<module>/schemas.py`    |
| Add business logic    | `modules/<module>/service.py`    |
| Add database query    | `modules/<module>/repository.py` |

**Advice:**

- Use this table as the ONLY guide for file placement
- If something doesn't fit these categories, it probably belongs in `service.py`
- Cross-cutting concerns go in `infrastructure/`, not in module subdirectories

---

### Summary: The Golden Rules

1. **Check existing patterns first** - Look at 2-3 similar modules before creating anything
2. **Standard files only** - `__init__.py`, `models.py`, `schemas.py`, `repository.py`, `service.py`, `router.py`
3. **No subdirectories in modules** - Everything fits in the standard files
4. **Consolidate, don't split** - When unsure, put it in `service.py`
5. **Simple > Clever** - Direct code beats abstract patterns
6. **YAGNI** - Don't build for hypothetical future requirements
|
||||
|
||||
---
|
||||
|
||||
_Last updated: February 2026_
|
||||
- Keep `../AGENTS.md` as the workflow and delegation source of truth.
|
||||
- Treat `CLAUDE.md` as architecture, commands, and conventions only.
|
||||
- Do not rely on `.claude/` directory contents.
|
||||
|
||||