Agents Concept and Design¶
TDD
The Xavier | Agents TDD, v.1.0 document describes a multi-agent system where a Project Agent oversees the workflow and specialized agents handle tasks like image generation. Tasks are organized Kanban-style, and the system runs on Python, FastAPI, PostgreSQL, and Docker. The initial MVP focuses on the core agent setup, basic project and task management, and a simple job queue, giving users a smooth, automated workflow without the usual technical headaches.
Agents are intelligent components designed to manage and orchestrate the digital product creation workflow.
Agent Design Overview¶
The system utilizes two types of intelligent agents, both fundamentally built on the langgraph framework="" for stateful conversation management. The core design principle is the Dual Tool Architecture, which clearly separates data retrieval from execution actions:
- Read Operations (Data Retrieval): Handled by the MCP Server (Model Context Protocol).
- Write Operations (Actions): Handled by the ToolNode (LangGraph Tools).
The entire process is LLM-driven, meaning the large language model within the agent decides when to fetch context, create new entities, or simply provide a response.
MCP Server vs ToolNode¶
MCP Server (Model Context Protocol)¶
Purpose: Provides centralized access to external data sources and context information.
Use Cases:
- Retrieving available AI models from remote database
- Fetching project memory (projects, activities, approved assets)
- Getting card memory (task card details, generation parameters)
- Accessing model metadata (capabilities, parameters, tags, styles)
Architecture:
- Separate service with its own database connections
- Implements caching for frequently accessed data
- Provides REST API for data retrieval
- Handles complex queries and data aggregation
Key Characteristics:
- Read-only operations
- External data access
- Caching layer
- Cross-agent shared resource
ToolNode (LangGraph Tools)¶
Purpose: Executes actions and operations within the agent workflow.
Use Cases:
- Creating new activities in projects
- Creating task cards within activities
- Updating generation parameters
- Creating generation jobs
- Approving assets
Architecture:
- Integrated into LangGraph agent workflow
- Direct database operations
- Stateful operations with transaction support
- Agent-specific implementations
Key Characteristics:
- Write operations
- Business logic execution
- Stateful actions
- Agent workflow integration
Usage Decision Matrix¶
| Operation Type | Use MCP Server | Use ToolNode |
|---|---|---|
| Get available models | ✅ | ❌ |
| Get project context | ✅ | ❌ |
| Create activity | ❌ | ✅ |
| Update parameters | ❌ | ✅ |
| Search models by tag | ✅ | ❌ |
| Create generation job | ❌ | ✅ |
LangGraph Implementation Patterns¶
Tool Functions Architecture¶
Tool Definition:
- Tools are implemented as Python functions with @tool decorator
- Each tool represents either MCP Server call or ToolNode operation
- Tools are automatically bound to LLM for selection and execution
MCP Server Tools:
- Make HTTP requests to MCP Server endpoints
- Handle data retrieval and context gathering
- Examples:
get_available_models(),search_models_by_style(),get_project_memory()
ToolNode Tools:
- Perform direct database operations
- Execute business logic and state changes
- Examples:
create_generation_job(),update_generation_params(),create_activity()
Core Agent Architecture¶
Agent State:
- messages: Conversation history
- project_id: Current project context
- user_id: User identification
Graph Structure:
- agent node: LLM processes messages and decides on tool usage
- tools node: Executes selected tools and returns results
- Conditional routing: Based on whether LLM makes tool calls
Flow Control:
- LLM analyzes user message
- If tools needed → route to tools node
- Execute tools and return results
- Continue until LLM provides final response
State Management Strategy¶
Conversation History Management:
- Maintain last 100 messages in agent state
- Use circular buffer pattern for efficient memory usage
- Persist state to PostgreSQL via LangGraph's built-in persistence
ConversationManager Architecture:
- Manages message buffer with configurable size limits
- Implements circular buffer to prevent memory growth
- Provides context extraction for LLM processing
- Integrates with LangGraph state management
Key Responsibilities:
- Message buffer maintenance and rotation
- Context formatting for LLM consumption
- State persistence and retrieval
- Memory usage optimization
Project Agent Specifications¶
Purpose¶
Manages high-level project orchestration, understands user intent, and creates activities/task cards.
Available Tools¶
MCP Server Tools (Data Retrieval):
get_project_memory(): Retrieve complete project context from MCP Serverget_user_history(): Get user's previous interactions and preferences
ToolNode Tools (Operations):
create_activity(): Create new activity with auto-generated positioncreate_task_card(): Create task card within specified activityreorder_activities(): Change activity positions in Kanban board
Agent Architecture¶
ProjectAgent Class:
- Inherits from BaseAgent with project-specific tools
- Manages high-level workflow orchestration
- Handles natural language intent analysis
- Creates project structure based on user requirements
Capabilities¶
-
Category Management:
- Analyze user requirements to determine category types
- Generate appropriate names and descriptions
- Set correct task types based on context
-
Task Card Creation:
- Break down categories into specific task cards
- Infer card details from project context
- Maintain relationships between cards
-
Workflow Suggestions:
- Propose next steps based on project state
- Identify missing components in workflow
- Suggest optimal task sequences
Prompt Engineering¶
ProjectAgentPrompts Architecture:
- System-level prompts define agent role and capabilities
- Context-aware prompt templates with dynamic variable injection
- Intent analysis prompts for user message understanding
- Task-specific prompt variations for different operations
Prompt Categories:
- System Prompts: Define agent identity and core responsibilities
- Intent Analysis: Extract user intent and mentioned entities
- Context Integration: Incorporate project memory and task types
- Action Planning: Guide decision-making for tool usage
Specialized Agent Specifications¶
Purpose¶
Handle domain-specific tasks within task cards, manage generation parameters, and create generation jobs.
Image Generation Agent Example¶
Available Tools¶
MCP Server Tools (Data Retrieval):
get_available_models(): Query external models database by task type and tagssearch_models_by_style(): Find models suitable for specific artistic stylesget_card_memory(): Retrieve current task card context and parametersget_model_details(): Get detailed model capabilities and parameters
ToolNode Tools (Operations):
update_generation_params(): Modify task card generation parameterscreate_generation_job(): Submit new generation job to queuecancel_job(): Cancel pending generation jobapprove_asset(): Add generated result to project assets
Agent Architecture¶
ImageGenerationAgent Class:
- Inherits from BaseAgent with generation-specific tools
- Contains domain expertise for image generation models
- Handles parameter optimization and model selection
- Manages generation job lifecycle
Interaction Flow Example¶
- User Input: "Generate photorealistic portrait"
- Model Discovery: Agent queries MCP for suitable models
- Parameter Update: Agent selects optimal model and parameters
- Job Creation: Agent submits generation job with parameters
- User Response: Agent confirms job creation and provides status
Specialized Knowledge and Model Discovery¶
MCP Server Architecture¶
Service Design:
- Separate FastAPI service independent of agent services
- Handles connections to external model databases
- Implements caching layer for frequently accessed data
- Provides REST API endpoints for data retrieval
Key Responsibilities:
- External model database connectivity and querying
- Data aggregation from multiple sources (project DB + external model DB)
- Response caching to reduce external API calls
- Filtering and search capabilities for model discovery
Endpoints:
/models/{task_type}: Get models by type with optional tag filtering/models/search: Advanced model search with style matching/project-memory/{project_id}: Aggregated project context/card-memory/{card_id}: Task card context and job history
Integration Architecture¶
Agent ↔ MCP Server Flow:
- Agent tool function makes HTTP request to MCP Server
- MCP Server checks cache for requested data
- If cache miss, MCP Server queries external databases
- MCP Server returns structured data to agent tool
- Agent processes data and continues workflow
Data Sources:
- External Model Database: Contains AI model metadata, capabilities, tags
- Main Application Database: Project, activity, card, and job data
- Cache Layer: In-memory storage for frequently accessed model data
Generation Parameter Management¶
Parameter Lifecycle:
- Default parameters set on card creation
- User requests modify parameters through natural language
- Agent validates and updates parameters via ToolNode
- Parameters stored in task card generation_params JSONB field
Parameter Categories:
- Model Selection: AI model ID and version
- Generation Settings: Resolution, guidance scale, steps
- Content Control: Prompts, negative prompts, style modifiers
- Output Options: Number of images, format preferences
Validation and Defaults:
- Each model type has specific parameter constraints
- Agent validates parameters against model capabilities
- Fallback to sensible defaults for missing parameters
Job Completion Handling¶
Completion Flow:
- Notification Service receives job completion from cloud service
- Service calls agent's job completion endpoint
- Agent loads current conversation state
- Agent formats completion message with result details
- Agent updates conversation history and notifies UI
Completion Types:
- Success: Include result URL and generation metadata
- Failure: Include error details and suggested actions
- Partial: Handle partially successful batch generations
State Management:
- Agent maintains conversation continuity across job completion
- Completion messages integrated into natural conversation flow
- WebSocket notifications sent to keep UI synchronized
Common Agent Patterns¶
Error Handling¶
Error Categories:
- Tool Execution Errors: MCP Server unavailable, database connection issues
- Validation Errors: Invalid parameters, missing required data
- External Service Errors: Cloud generation service failures
- LLM Errors: Token limits, rate limiting, model unavailability
Error Recovery:
- Graceful degradation when MCP Server is unavailable
- Retry mechanisms for transient failures
- User-friendly error messages in natural language
- Structured error logging for debugging and monitoring
Error Context:
- Preserve conversation state across error conditions
- Include error details in agent response without exposing internals
- Suggest alternative actions when possible
Agent Deployment Considerations¶
Service Architecture¶
AgentService Class:
- FastAPI wrapper around LangGraph agent
- Manages agent lifecycle and state persistence
- Handles HTTP endpoints for chat and job completion
- Integrates with PostgreSQL checkpointer for conversation persistence
Key Endpoints:
POST /chat/{project_id}: Process user messagesPOST /job-completion/{card_id}: Handle job completion callbacksGET /health: Service health and dependency status
State Persistence¶
LangGraph Integration:
- PostgresSaver for conversation state persistence
- Thread-based state isolation per user/project combination
- Automatic state loading and saving across requests
Configuration:
- Thread ID format:
{project_id}:{user_id}for project agents - Thread ID format:
{card_id}:{user_id}for specialized agents - State includes full conversation history and context
Health Monitoring¶
Dependency Checks:
- Database connectivity for state persistence
- MCP Server availability for data retrieval
- Message broker connection for notifications
- LLM service responsiveness
Health Status:
- Individual component health reporting
- Overall service health aggregation
- Version and configuration information