Skip to content

Agents Concept and Design

TDD

The Xavier | Agents TDD, v.1.0 document describes a multi-agent system where a Project Agent oversees the workflow and specialized agents handle tasks like image generation. Tasks are organized Kanban-style, and the system runs on Python, FastAPI, PostgreSQL, and Docker. The initial MVP focuses on the core agent setup, basic project and task management, and a simple job queue, giving users a smooth, automated workflow without the usual technical headaches.

Agents are intelligent components designed to manage and orchestrate the digital product creation workflow.

Agent Design Overview

The system utilizes two types of intelligent agents, both fundamentally built on the langgraph framework="" for stateful conversation management. The core design principle is the Dual Tool Architecture, which clearly separates data retrieval from execution actions:

  • Read Operations (Data Retrieval): Handled by the MCP Server (Model Context Protocol).
  • Write Operations (Actions): Handled by the ToolNode (LangGraph Tools).
--- config: layout: elk --- flowchart TD subgraph "Multi-Agent System &nbsp;" PA[Project Agent] SA1[Specialized Agent 1<br>Image Generation] SA2[Specialized Agent 2<br>Video Generation] end subgraph Tools MCP[MCP Server<br>Read-Only] ToolNode[ToolNode<br>Write/Action] end subgraph Data DB[(PostgreSQL / Data Storage)] end %% Agent interactions PA -->|Delegates tasks| SA1 PA -->|Delegates tasks| SA2 SA1 -->|Reads context| MCP SA2 -->|Reads context| MCP SA1 -->|Executes actions| ToolNode SA2 -->|Executes actions| ToolNode %% Tool interactions MCP -->|Fetches data| DB ToolNode -->|Updates data / triggers jobs| DB

The entire process is LLM-driven, meaning the large language model within the agent decides when to fetch context, create new entities, or simply provide a response.

MCP Server vs ToolNode

MCP Server (Model Context Protocol)

Purpose: Provides centralized access to external data sources and context information.

Use Cases:

  • Retrieving available AI models from remote database
  • Fetching project memory (projects, activities, approved assets)
  • Getting card memory (task card details, generation parameters)
  • Accessing model metadata (capabilities, parameters, tags, styles)

Architecture:

  • Separate service with its own database connections
  • Implements caching for frequently accessed data
  • Provides REST API for data retrieval
  • Handles complex queries and data aggregation

Key Characteristics:

  • Read-only operations
  • External data access
  • Caching layer
  • Cross-agent shared resource

ToolNode (LangGraph Tools)

Purpose: Executes actions and operations within the agent workflow.

Use Cases:

  • Creating new activities in projects
  • Creating task cards within activities
  • Updating generation parameters
  • Creating generation jobs
  • Approving assets

Architecture:

  • Integrated into LangGraph agent workflow
  • Direct database operations
  • Stateful operations with transaction support
  • Agent-specific implementations

Key Characteristics:

  • Write operations
  • Business logic execution
  • Stateful actions
  • Agent workflow integration

Usage Decision Matrix

Operation Type Use MCP Server Use ToolNode
Get available models
Get project context
Create activity
Update parameters
Search models by tag
Create generation job

LangGraph Implementation Patterns

Tool Functions Architecture

Tool Definition:

  • Tools are implemented as Python functions with @tool decorator
  • Each tool represents either MCP Server call or ToolNode operation
  • Tools are automatically bound to LLM for selection and execution

MCP Server Tools:

  • Make HTTP requests to MCP Server endpoints
  • Handle data retrieval and context gathering
  • Examples: get_available_models(), search_models_by_style(), get_project_memory()

ToolNode Tools:

  • Perform direct database operations
  • Execute business logic and state changes
  • Examples: create_generation_job(), update_generation_params(), create_activity()

Core Agent Architecture

Agent State:

  • messages: Conversation history
  • project_id: Current project context
  • user_id: User identification

Graph Structure:

  • agent node: LLM processes messages and decides on tool usage
  • tools node: Executes selected tools and returns results
  • Conditional routing: Based on whether LLM makes tool calls

Flow Control:

  1. LLM analyzes user message
  2. If tools needed → route to tools node
  3. Execute tools and return results
  4. Continue until LLM provides final response

State Management Strategy

Conversation History Management:

  • Maintain last 100 messages in agent state
  • Use circular buffer pattern for efficient memory usage
  • Persist state to PostgreSQL via LangGraph's built-in persistence

ConversationManager Architecture:

  • Manages message buffer with configurable size limits
  • Implements circular buffer to prevent memory growth
  • Provides context extraction for LLM processing
  • Integrates with LangGraph state management

Key Responsibilities:

  • Message buffer maintenance and rotation
  • Context formatting for LLM consumption
  • State persistence and retrieval
  • Memory usage optimization

Project Agent Specifications

Purpose

Manages high-level project orchestration, understands user intent, and creates activities/task cards.

Available Tools

MCP Server Tools (Data Retrieval):

  • get_project_memory(): Retrieve complete project context from MCP Server
  • get_user_history(): Get user's previous interactions and preferences

ToolNode Tools (Operations):

  • create_activity(): Create new activity with auto-generated position
  • create_task_card(): Create task card within specified activity
  • reorder_activities(): Change activity positions in Kanban board

Agent Architecture

ProjectAgent Class:

  • Inherits from BaseAgent with project-specific tools
  • Manages high-level workflow orchestration
  • Handles natural language intent analysis
  • Creates project structure based on user requirements

Capabilities

  1. Category Management:

    • Analyze user requirements to determine category types
    • Generate appropriate names and descriptions
    • Set correct task types based on context
  2. Task Card Creation:

    • Break down categories into specific task cards
    • Infer card details from project context
    • Maintain relationships between cards
  3. Workflow Suggestions:

    • Propose next steps based on project state
    • Identify missing components in workflow
    • Suggest optimal task sequences

Prompt Engineering

ProjectAgentPrompts Architecture:

  • System-level prompts define agent role and capabilities
  • Context-aware prompt templates with dynamic variable injection
  • Intent analysis prompts for user message understanding
  • Task-specific prompt variations for different operations

Prompt Categories:

  • System Prompts: Define agent identity and core responsibilities
  • Intent Analysis: Extract user intent and mentioned entities
  • Context Integration: Incorporate project memory and task types
  • Action Planning: Guide decision-making for tool usage

Specialized Agent Specifications

Purpose

Handle domain-specific tasks within task cards, manage generation parameters, and create generation jobs.

Image Generation Agent Example

Available Tools

MCP Server Tools (Data Retrieval):

  • get_available_models(): Query external models database by task type and tags
  • search_models_by_style(): Find models suitable for specific artistic styles
  • get_card_memory(): Retrieve current task card context and parameters
  • get_model_details(): Get detailed model capabilities and parameters

ToolNode Tools (Operations):

  • update_generation_params(): Modify task card generation parameters
  • create_generation_job(): Submit new generation job to queue
  • cancel_job(): Cancel pending generation job
  • approve_asset(): Add generated result to project assets

Agent Architecture

ImageGenerationAgent Class:

  • Inherits from BaseAgent with generation-specific tools
  • Contains domain expertise for image generation models
  • Handles parameter optimization and model selection
  • Manages generation job lifecycle

Interaction Flow Example

  1. User Input: "Generate photorealistic portrait"
  2. Model Discovery: Agent queries MCP for suitable models
  3. Parameter Update: Agent selects optimal model and parameters
  4. Job Creation: Agent submits generation job with parameters
  5. User Response: Agent confirms job creation and provides status

Specialized Knowledge and Model Discovery

MCP Server Architecture

Service Design:

  • Separate FastAPI service independent of agent services
  • Handles connections to external model databases
  • Implements caching layer for frequently accessed data
  • Provides REST API endpoints for data retrieval

Key Responsibilities:

  • External model database connectivity and querying
  • Data aggregation from multiple sources (project DB + external model DB)
  • Response caching to reduce external API calls
  • Filtering and search capabilities for model discovery

Endpoints:

  • /models/{task_type}: Get models by type with optional tag filtering
  • /models/search: Advanced model search with style matching
  • /project-memory/{project_id}: Aggregated project context
  • /card-memory/{card_id}: Task card context and job history

Integration Architecture

Agent ↔ MCP Server Flow:

  1. Agent tool function makes HTTP request to MCP Server
  2. MCP Server checks cache for requested data
  3. If cache miss, MCP Server queries external databases
  4. MCP Server returns structured data to agent tool
  5. Agent processes data and continues workflow

Data Sources:

  • External Model Database: Contains AI model metadata, capabilities, tags
  • Main Application Database: Project, activity, card, and job data
  • Cache Layer: In-memory storage for frequently accessed model data

Generation Parameter Management

Parameter Lifecycle:

  • Default parameters set on card creation
  • User requests modify parameters through natural language
  • Agent validates and updates parameters via ToolNode
  • Parameters stored in task card generation_params JSONB field

Parameter Categories:

  • Model Selection: AI model ID and version
  • Generation Settings: Resolution, guidance scale, steps
  • Content Control: Prompts, negative prompts, style modifiers
  • Output Options: Number of images, format preferences

Validation and Defaults:

  • Each model type has specific parameter constraints
  • Agent validates parameters against model capabilities
  • Fallback to sensible defaults for missing parameters

Job Completion Handling

Completion Flow:

  1. Notification Service receives job completion from cloud service
  2. Service calls agent's job completion endpoint
  3. Agent loads current conversation state
  4. Agent formats completion message with result details
  5. Agent updates conversation history and notifies UI

Completion Types:

  • Success: Include result URL and generation metadata
  • Failure: Include error details and suggested actions
  • Partial: Handle partially successful batch generations

State Management:

  • Agent maintains conversation continuity across job completion
  • Completion messages integrated into natural conversation flow
  • WebSocket notifications sent to keep UI synchronized

Common Agent Patterns

Error Handling

Error Categories:

  • Tool Execution Errors: MCP Server unavailable, database connection issues
  • Validation Errors: Invalid parameters, missing required data
  • External Service Errors: Cloud generation service failures
  • LLM Errors: Token limits, rate limiting, model unavailability

Error Recovery:

  • Graceful degradation when MCP Server is unavailable
  • Retry mechanisms for transient failures
  • User-friendly error messages in natural language
  • Structured error logging for debugging and monitoring

Error Context:

  • Preserve conversation state across error conditions
  • Include error details in agent response without exposing internals
  • Suggest alternative actions when possible

Agent Deployment Considerations

Service Architecture

AgentService Class:

  • FastAPI wrapper around LangGraph agent
  • Manages agent lifecycle and state persistence
  • Handles HTTP endpoints for chat and job completion
  • Integrates with PostgreSQL checkpointer for conversation persistence

Key Endpoints:

  • POST /chat/{project_id}: Process user messages
  • POST /job-completion/{card_id}: Handle job completion callbacks
  • GET /health: Service health and dependency status

State Persistence

LangGraph Integration:

  • PostgresSaver for conversation state persistence
  • Thread-based state isolation per user/project combination
  • Automatic state loading and saving across requests

Configuration:

  • Thread ID format: {project_id}:{user_id} for project agents
  • Thread ID format: {card_id}:{user_id} for specialized agents
  • State includes full conversation history and context

Health Monitoring

Dependency Checks:

  • Database connectivity for state persistence
  • MCP Server availability for data retrieval
  • Message broker connection for notifications
  • LLM service responsiveness

Health Status:

  • Individual component health reporting
  • Overall service health aggregation
  • Version and configuration information