Agents Concept and Design¶

TDD

The Xavier | Agents TDD, v.1.0 document describes a multi-agent system where a Project Agent oversees the workflow and specialized agents handle tasks like image generation. Tasks are organized Kanban-style, and the system runs on Python, FastAPI, PostgreSQL, and Docker. The initial MVP focuses on the core agent setup, basic project and task management, and a simple job queue, giving users a smooth, automated workflow without the usual technical headaches.

Agents are intelligent components designed to manage and orchestrate the digital product creation workflow.

Agent Design Overview¶

The system utilizes two types of intelligent agents, both fundamentally built on the langgraph framework="" for stateful conversation management. The core design principle is the Dual Tool Architecture, which clearly separates data retrieval from execution actions:

Read Operations (Data Retrieval): Handled by the MCP Server (Model Context Protocol).
Write Operations (Actions): Handled by the ToolNode (LangGraph Tools).

The entire process is LLM-driven, meaning the large language model within the agent decides when to fetch context, create new entities, or simply provide a response.

MCP Server vs ToolNode¶

MCP Server (Model Context Protocol)¶

Purpose: Provides centralized access to external data sources and context information.

Use Cases:

Retrieving available AI models from remote database
Fetching project memory (projects, activities, approved assets)
Getting card memory (task card details, generation parameters)
Accessing model metadata (capabilities, parameters, tags, styles)

Architecture:

Separate service with its own database connections
Implements caching for frequently accessed data
Provides REST API for data retrieval
Handles complex queries and data aggregation

Key Characteristics:

Read-only operations
External data access
Caching layer
Cross-agent shared resource

ToolNode (LangGraph Tools)¶

Purpose: Executes actions and operations within the agent workflow.

Use Cases:

Creating new activities in projects
Creating task cards within activities
Updating generation parameters
Creating generation jobs
Approving assets

Architecture:

Integrated into LangGraph agent workflow
Direct database operations
Stateful operations with transaction support
Agent-specific implementations

Key Characteristics:

Write operations
Business logic execution
Stateful actions
Agent workflow integration

Usage Decision Matrix¶

Operation Type	Use MCP Server	Use ToolNode
Get available models	✅	❌
Get project context	✅	❌
Create activity	❌	✅
Update parameters	❌	✅
Search models by tag	✅	❌
Create generation job	❌	✅

LangGraph Implementation Patterns¶

Tool Functions Architecture¶

Tool Definition:

Tools are implemented as Python functions with @tool decorator
Each tool represents either MCP Server call or ToolNode operation
Tools are automatically bound to LLM for selection and execution

MCP Server Tools:

Make HTTP requests to MCP Server endpoints
Handle data retrieval and context gathering
Examples: get_available_models(), search_models_by_style(), get_project_memory()

ToolNode Tools:

Perform direct database operations
Execute business logic and state changes
Examples: create_generation_job(), update_generation_params(), create_activity()

Core Agent Architecture¶

Agent State:

messages: Conversation history
project_id: Current project context
user_id: User identification

Graph Structure:

agent node: LLM processes messages and decides on tool usage
tools node: Executes selected tools and returns results
Conditional routing: Based on whether LLM makes tool calls

Flow Control:

LLM analyzes user message
If tools needed → route to tools node
Execute tools and return results
Continue until LLM provides final response

State Management Strategy¶

Conversation History Management:

Maintain last 100 messages in agent state
Use circular buffer pattern for efficient memory usage
Persist state to PostgreSQL via LangGraph's built-in persistence

ConversationManager Architecture:

Manages message buffer with configurable size limits
Implements circular buffer to prevent memory growth
Provides context extraction for LLM processing
Integrates with LangGraph state management

Key Responsibilities:

Message buffer maintenance and rotation
Context formatting for LLM consumption
State persistence and retrieval
Memory usage optimization

Project Agent Specifications¶

Purpose¶

Manages high-level project orchestration, understands user intent, and creates activities/task cards.

Available Tools¶

MCP Server Tools (Data Retrieval):

get_project_memory(): Retrieve complete project context from MCP Server
get_user_history(): Get user's previous interactions and preferences

ToolNode Tools (Operations):

create_activity(): Create new activity with auto-generated position
create_task_card(): Create task card within specified activity
reorder_activities(): Change activity positions in Kanban board

Agent Architecture¶

ProjectAgent Class:

Inherits from BaseAgent with project-specific tools
Manages high-level workflow orchestration
Handles natural language intent analysis
Creates project structure based on user requirements

Capabilities¶

Category Management:
- Analyze user requirements to determine category types
- Generate appropriate names and descriptions
- Set correct task types based on context
Task Card Creation:
- Break down categories into specific task cards
- Infer card details from project context
- Maintain relationships between cards
Workflow Suggestions:
- Propose next steps based on project state
- Identify missing components in workflow
- Suggest optimal task sequences

Prompt Engineering¶

ProjectAgentPrompts Architecture:

System-level prompts define agent role and capabilities
Context-aware prompt templates with dynamic variable injection
Intent analysis prompts for user message understanding
Task-specific prompt variations for different operations

Prompt Categories:

System Prompts: Define agent identity and core responsibilities
Intent Analysis: Extract user intent and mentioned entities
Context Integration: Incorporate project memory and task types
Action Planning: Guide decision-making for tool usage

Specialized Agent Specifications¶

Purpose¶

Handle domain-specific tasks within task cards, manage generation parameters, and create generation jobs.

Image Generation Agent Example¶

Available Tools¶

MCP Server Tools (Data Retrieval):

get_available_models(): Query external models database by task type and tags
search_models_by_style(): Find models suitable for specific artistic styles
get_card_memory(): Retrieve current task card context and parameters
get_model_details(): Get detailed model capabilities and parameters

ToolNode Tools (Operations):

update_generation_params(): Modify task card generation parameters
create_generation_job(): Submit new generation job to queue
cancel_job(): Cancel pending generation job
approve_asset(): Add generated result to project assets

Agent Architecture¶

ImageGenerationAgent Class:

Inherits from BaseAgent with generation-specific tools
Contains domain expertise for image generation models
Handles parameter optimization and model selection
Manages generation job lifecycle

Interaction Flow Example¶

User Input: "Generate photorealistic portrait"
Model Discovery: Agent queries MCP for suitable models
Parameter Update: Agent selects optimal model and parameters
Job Creation: Agent submits generation job with parameters
User Response: Agent confirms job creation and provides status

Specialized Knowledge and Model Discovery¶

MCP Server Architecture¶

Service Design:

Separate FastAPI service independent of agent services
Handles connections to external model databases
Implements caching layer for frequently accessed data
Provides REST API endpoints for data retrieval

Key Responsibilities:

External model database connectivity and querying
Data aggregation from multiple sources (project DB + external model DB)
Response caching to reduce external API calls
Filtering and search capabilities for model discovery

Endpoints:

/models/{task_type}: Get models by type with optional tag filtering
/models/search: Advanced model search with style matching
/project-memory/{project_id}: Aggregated project context
/card-memory/{card_id}: Task card context and job history

Integration Architecture¶

Agent ↔ MCP Server Flow:

Agent tool function makes HTTP request to MCP Server
MCP Server checks cache for requested data
If cache miss, MCP Server queries external databases
MCP Server returns structured data to agent tool
Agent processes data and continues workflow

Data Sources:

External Model Database: Contains AI model metadata, capabilities, tags
Main Application Database: Project, activity, card, and job data
Cache Layer: In-memory storage for frequently accessed model data

Generation Parameter Management¶

Parameter Lifecycle:

Default parameters set on card creation
User requests modify parameters through natural language
Agent validates and updates parameters via ToolNode
Parameters stored in task card generation_params JSONB field

Parameter Categories:

Model Selection: AI model ID and version
Generation Settings: Resolution, guidance scale, steps
Content Control: Prompts, negative prompts, style modifiers
Output Options: Number of images, format preferences

Validation and Defaults:

Each model type has specific parameter constraints
Agent validates parameters against model capabilities
Fallback to sensible defaults for missing parameters

Job Completion Handling¶

Completion Flow:

Notification Service receives job completion from cloud service
Service calls agent's job completion endpoint
Agent loads current conversation state
Agent formats completion message with result details
Agent updates conversation history and notifies UI

Completion Types:

Success: Include result URL and generation metadata
Failure: Include error details and suggested actions
Partial: Handle partially successful batch generations

State Management:

Agent maintains conversation continuity across job completion
Completion messages integrated into natural conversation flow
WebSocket notifications sent to keep UI synchronized

Common Agent Patterns¶

Error Handling¶

Error Categories:

Tool Execution Errors: MCP Server unavailable, database connection issues
Validation Errors: Invalid parameters, missing required data
External Service Errors: Cloud generation service failures
LLM Errors: Token limits, rate limiting, model unavailability

Error Recovery:

Graceful degradation when MCP Server is unavailable
Retry mechanisms for transient failures
User-friendly error messages in natural language
Structured error logging for debugging and monitoring

Error Context:

Preserve conversation state across error conditions
Include error details in agent response without exposing internals
Suggest alternative actions when possible

Agent Deployment Considerations¶

Service Architecture¶

AgentService Class:

FastAPI wrapper around LangGraph agent
Manages agent lifecycle and state persistence
Handles HTTP endpoints for chat and job completion
Integrates with PostgreSQL checkpointer for conversation persistence

Key Endpoints:

POST /chat/{project_id}: Process user messages
POST /job-completion/{card_id}: Handle job completion callbacks
GET /health: Service health and dependency status

State Persistence¶

LangGraph Integration:

PostgresSaver for conversation state persistence
Thread-based state isolation per user/project combination
Automatic state loading and saving across requests

Configuration:

Thread ID format: {project_id}:{user_id} for project agents
Thread ID format: {card_id}:{user_id} for specialized agents
State includes full conversation history and context

Health Monitoring¶

Dependency Checks:

Database connectivity for state persistence
MCP Server availability for data retrieval
Message broker connection for notifications
LLM service responsiveness

Health Status:

Individual component health reporting
Overall service health aggregation
Version and configuration information