Data Model¶

Overview¶

The data model is designed to support a flexible, hierarchical structure for digital product creation projects. It uses PostgreSQL as the primary database, leveraging JSONB fields for flexible schema evolution and UUID primary keys for distributed system compatibility.

Core Entities¶

Projects¶

The top-level entity representing a digital product creation project.

CREATE TABLE projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    title VARCHAR(255) NOT NULL,
    description TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    user_id UUID NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb
);

CREATE INDEX idx_projects_user_id ON projects(user_id);
CREATE INDEX idx_projects_created_at ON projects(created_at);

Key Fields:

id: Unique project identifier
title: Human-readable project name
description: Detailed project description, may include script or concept
user_id: Owner of the project
metadata: Flexible field for future extensions

Categories¶

Categories represent major workflow stages within a project (e.g., "Character Generation", "Scene Creation").

CREATE TABLE categories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    title VARCHAR(255) NOT NULL,
    description TEXT,
    type VARCHAR(20) NOT NULL,
    position INTEGER NOT NULL DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    metadata JSONB DEFAULT '{}'::jsonb,
    CONSTRAINT valid_type CHECK (
        type IN ('image', 'video', 'text')
    )
);

CREATE INDEX idx_activities_project_id ON activities(project_id);
CREATE INDEX idx_activities_task_type ON activities(task_type);
CREATE INDEX idx_activities_position ON activities(project_id, position);

Key Fields:

task_type: Defines what type of generation tasks can be created within this activity
position: Order of the activity in the Kanban board
metadata: Activity-specific configuration

Task Cards¶

Individual tasks within an activity, representing specific generation work.

CREATE TABLE task_cards (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    activity_id UUID NOT NULL REFERENCES activities(id) ON DELETE CASCADE,
    title VARCHAR(255) NOT NULL,
    description TEXT,
    generation_params JSONB DEFAULT '{}'::jsonb,
    position INTEGER NOT NULL DEFAULT 0,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    metadata JSONB DEFAULT '{}'::jsonb
);

CREATE INDEX idx_task_cards_activity_id ON task_cards(activity_id);
CREATE INDEX idx_task_cards_position ON task_cards(activity_id, position);
CREATE INDEX idx_task_cards_generation_params ON task_cards USING GIN (generation_params);

Key Fields:

generation_params: Current generation parameters (model, prompt, etc.)
position: Order within the activity column
metadata: Card-specific data

Example generation_params structure:

{
    "model": "stable-diffusion-xl",
    "prompt": "A heroic warrior in ancient armor",
    "negative_prompt": "low quality, blurry",
    "resolution": "1024x1024",
    "guidance_scale": 7.5,
    "num_images": 4
}

Generation Jobs¶

Tracks individual generation job requests and their status.

CREATE TABLE generation_jobs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    card_id UUID NOT NULL REFERENCES task_cards(id) ON DELETE CASCADE,
    status VARCHAR(50) NOT NULL DEFAULT 'queued',
    params JSONB NOT NULL,
    result_url TEXT,
    error_message TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    started_at TIMESTAMP WITH TIME ZONE,
    completed_at TIMESTAMP WITH TIME ZONE,
    retry_count INTEGER DEFAULT 0,
    metadata JSONB DEFAULT '{}'::jsonb,
    CONSTRAINT valid_status CHECK (
        status IN ('queued', 'processing', 'completed', 'failed', 'cancelled')
    )
);

CREATE INDEX idx_generation_jobs_card_id ON generation_jobs(card_id);
CREATE INDEX idx_generation_jobs_status ON generation_jobs(status);
CREATE INDEX idx_generation_jobs_created_at ON generation_jobs(created_at);

Key Fields:

status: Current job state in the pipeline
params: Snapshot of generation parameters at job creation time
result_url: Location of generated asset (if successful)
error_message: Details if job failed
retry_count: Number of retry attempts

Approved Assets¶

Assets that users have explicitly approved and added to project memory.

CREATE TABLE approved_assets (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    card_id UUID NOT NULL REFERENCES task_cards(id) ON DELETE SET NULL,
    job_id UUID REFERENCES generation_jobs(id) ON DELETE SET NULL,
    url TEXT NOT NULL,
    title VARCHAR(255) NOT NULL,
    description TEXT,
    prompt TEXT,
    asset_type VARCHAR(50) NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    metadata JSONB DEFAULT '{}'::jsonb,
    CONSTRAINT valid_asset_type CHECK (
        asset_type IN ('image', 'video', 'text', 'audio')
    )
);

CREATE INDEX idx_approved_assets_project_id ON approved_assets(project_id);
CREATE INDEX idx_approved_assets_card_id ON approved_assets(card_id);
CREATE INDEX idx_approved_assets_asset_type ON approved_assets(asset_type);

Key Fields:

url: Location of the approved asset
prompt: The prompt used to generate this asset
asset_type: Type of asset for filtering and display

Agent States¶

Persists LangGraph agent conversation states.

CREATE TABLE agent_states (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    agent_type VARCHAR(50) NOT NULL,
    context_id UUID NOT NULL, -- project_id for project agents, card_id for specialized agents
    thread_id VARCHAR(255) NOT NULL, -- LangGraph thread identifier
    state JSONB NOT NULL,
    checkpoint JSONB, -- LangGraph checkpoint data
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT valid_agent_type CHECK (
        agent_type IN ('project_agent', 'image_generation_agent', 'video_generation_agent', 'text_generation_agent')
    )
);

CREATE UNIQUE INDEX idx_agent_states_thread_id ON agent_states(thread_id);
CREATE INDEX idx_agent_states_context_id ON agent_states(context_id);
CREATE INDEX idx_agent_states_agent_type ON agent_states(agent_type);
CREATE INDEX idx_agent_states_updated_at ON agent_states(updated_at);

Key Fields:

context_id: References either project_id or card_id depending on agent type
thread_id: Unique identifier for conversation thread
state: Full conversation state including messages
checkpoint: LangGraph-specific checkpoint data

Chat Messages¶

Stores chat history for audit and context retrieval.

CREATE TABLE chat_messages (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    agent_type VARCHAR(50) NOT NULL,
    context_id UUID NOT NULL, -- project_id or card_id
    user_id UUID NOT NULL,
    role VARCHAR(20) NOT NULL, -- 'user', 'assistant', 'system', 'tool'
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT valid_role CHECK (
        role IN ('user', 'assistant', 'system', 'tool')
    )
);

CREATE INDEX idx_chat_messages_context_id ON chat_messages(context_id, created_at);
CREATE INDEX idx_chat_messages_user_id ON chat_messages(user_id);

Entity Relationships¶

erDiagram
    PROJECTS ||--o{ ACTIVITIES : contains
    PROJECTS ||--o{ APPROVED_ASSETS : has
    ACTIVITIES ||--o{ TASK_CARDS : contains
    TASK_CARDS ||--o{ GENERATION_JOBS : creates
    TASK_CARDS ||--o{ APPROVED_ASSETS : produces
    GENERATION_JOBS ||--o| APPROVED_ASSETS : results_in
    PROJECTS ||--o{ AGENT_STATES : has_conversations
    TASK_CARDS ||--o{ AGENT_STATES : has_conversations
    PROJECTS ||--o{ CHAT_MESSAGES : logs
    TASK_CARDS ||--o{ CHAT_MESSAGES : logs

Data Access Patterns¶

Project Memory¶

Purpose: Provides complete project context for the Project Agent.

Required data from tables:

projects: title, description, metadata
activities: id, title, description, task_type, position
task_cards: id, title, description (all cards across project)
approved_assets: id, title, url, asset_type, description, prompt
generation_jobs: Recent jobs with status and results

Card Memory¶

Purpose: Provides task-specific context for Specialized Agents.

Required data from tables:

task_cards: title, description, generation_params, metadata (current card only)
activities: task_type (parent activity's type)
generation_jobs: All jobs for this card with status, params, result_url
approved_assets: Assets created from this card
projects: Basic project context (title, description)

Active Job Monitoring¶

Purpose: Enables the Notification Service to track and update job statuses.

Required data from tables:

generation_jobs: All active jobs (status = 'queued' or 'processing')
task_cards: Card context for each job
activities: To get project_id for routing notifications

Database Optimization¶

Indexes¶

Primary keys use UUID for distributed compatibility
Foreign key indexes for efficient joins
Status indexes for job queue queries
Timestamp indexes for temporal queries
GIN indexes on JSONB fields for flexible querying

Partitioning Strategy (Future)¶

For scale, consider partitioning:

generation_jobs by created_at (monthly partitions)
chat_messages by created_at (monthly partitions)
approved_assets by project_id (hash partitioning)

Connection Pooling¶

Use PgBouncer or similar for connection pooling
Separate read/write connection pools
Configure based on service requirements

Migration Strategy¶

Initial Setup¶

-- Enable UUID extension
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Create tables in dependency order
-- 1. Projects
-- 2. Activities  
-- 3. Task Cards
-- 4. Generation Jobs
-- 5. Approved Assets
-- 6. Agent States
-- 7. Chat Messages

Version Control¶

Use migration tools like Alembic or Flyway
Maintain forward-only migrations
Test migrations in staging environment
Keep rollback scripts for emergency

Data Integrity¶

Use database constraints where possible
Implement application-level validation
Regular consistency checks
Automated backup strategy

Future Considerations¶

Potential Extensions¶

User Management: Full user/team tables when expanding beyond single user
Permissions: Role-based access control tables
Templates: Reusable project/activity templates
Versioning: Asset version tracking
Analytics: Aggregated metrics tables

Performance Optimization¶

Materialized Views: For complex project memory queries
Caching Layer: Redis for hot data (future addition)
Read Replicas: For scaling read operations
Archival: Move old completed projects to cold storage