How MCP Servers Interact with Agents

At its core, the interaction between an AI agent and an MCP server rests on a clean separation of responsibilities.

  • Agents are responsible for thinking: understanding intent, planning steps, and deciding what should happen next.
  • MCP servers are responsible for doing: executing actions, managing state, persisting data, and integrating with real systems.

They communicate through a strict, well-defined protocol rather than ad-hoc function calls or direct system access.


The Big Picture

From the agent’s point of view, an MCP server is a capability provider.

From the system’s point of view, it is a safety and execution boundary.

A typical interaction looks like this:

User → Agent → MCP Client → MCP Server → External Systems

The agent never talks directly to databases, workflow engines, or generation services. Everything flows through MCP servers.


Step-by-Step Interaction Flow

1. The Agent Receives a Goal

The interaction begins when the agent receives a user request, such as:

“Create a cinematic character image and store it in the project.”

At this stage, the agent is only reasoning. It:

  • Interprets the user’s intent
  • Breaks the request into logical steps
  • Identifies which capabilities are needed

No execution happens yet.


2. Capability Discovery

Once the agent has connected to an MCP server and the connection has been initialized, it asks:

“What can you do?”

This is done via a tool discovery request (tools/list). The server responds with:

  • A list of available tools
  • Descriptions of what each tool does
  • Input schemas and constraints
  • Safety hints (read-only, destructive, idempotent)

This step is critical: it defines the agent’s allowed action space.
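As a rough illustration, a client could turn the discovery result into an explicit action space along the following lines. This is a minimal sketch, not a reference implementation: `build_action_space`, `read_only_tools`, the `create_card` description, and the sample `annotations` values are hypothetical, and the `readOnlyHint` / `destructiveHint` keys are assumed to be how this server expresses the safety hints listed above.

```python
# Hypothetical tools/list result. The "annotations" entries are an assumption
# about how this server reports its safety hints.
DISCOVERY_RESULT = {
    "tools": [
        {
            "name": "find_models_by_criteria",
            "description": "Find AI models matching cost, type, and resolution constraints",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "model_type": {"type": "string"},
                    "max_cost": {"type": "number"},
                },
            },
            "annotations": {"readOnlyHint": True},
        },
        {
            "name": "create_card",
            "description": "Create a generation card in a project (illustrative text)",
            "inputSchema": {"type": "object", "properties": {}},
            "annotations": {"readOnlyHint": False, "destructiveHint": False},
        },
    ]
}


def build_action_space(result: dict) -> dict:
    """Index advertised tools by name; names absent from the index are off-limits."""
    return {tool["name"]: tool for tool in result["tools"]}


def read_only_tools(action_space: dict) -> list:
    """List the tools whose annotations mark them as read-only."""
    return sorted(
        name
        for name, tool in action_space.items()
        if tool.get("annotations", {}).get("readOnlyHint", False)
    )


action_space = build_action_space(DISCOVERY_RESULT)
print(sorted(action_space))
print(read_only_tools(action_space))
```

Anything the planner proposes that is not a key in `action_space` can be rejected before a request is ever sent.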


3. Tool Selection

Based on its plan, the agent selects the appropriate tool.

Examples:

  • Choosing a generation model → Models MCP Server
  • Loading or updating project memory → Project Structure MCP Server
  • Running a long, multi-step process → Temporal-backed MCP Server

The agent does not guess or improvise. It chooses only from the tools the server explicitly exposes.


4. Tool Invocation

To perform an action, the agent calls a tool using a structured request that conforms to the tool’s schema.

Important characteristics:

  • Arguments are validated by the server
  • Only declared tools can be invoked
  • No direct database queries or HTTP calls from the agent

Because every call passes through a declared, validated interface, the interaction is predictable and the agent’s reach is bounded.
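The server-side checks described here might look roughly like the sketch below. It assumes a hypothetical in-memory registry (`DECLARED_TOOLS`) and a deliberately tiny type check covering only `string` and `number`; a real server would more likely use a full JSON Schema validator. Rejecting undeclared arguments is a design choice of this sketch, not something the protocol requires.

```python
from typing import Optional

# Hypothetical registry of declared tools and their input schemas.
DECLARED_TOOLS = {
    "find_models_by_criteria": {
        "type": "object",
        "properties": {
            "model_type": {"type": "string"},
            "max_cost": {"type": "number"},
        },
    }
}

# JSON Schema type names mapped to Python types (only the two used here).
JSON_TYPES = {"string": str, "number": (int, float)}

INVALID_PARAMS = -32602  # JSON-RPC 2.0 "Invalid params" error code


def check_tool_call(request: dict) -> Optional[dict]:
    """Return a JSON-RPC error response for an invalid call, or None if it is valid."""
    params = request.get("params", {})
    name = params.get("name")
    arguments = params.get("arguments", {})

    def error(message: str) -> dict:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": INVALID_PARAMS, "message": message},
        }

    if name not in DECLARED_TOOLS:
        return error(f"Unknown tool: {name}")
    properties = DECLARED_TOOLS[name]["properties"]
    for key, value in arguments.items():
        if key not in properties:
            return error(f"Unknown argument: {key}")
        type_name = properties[key]["type"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, JSON_TYPES[type_name]):
            return error(f"Argument {key} must be a {type_name}")
    return None


valid_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "find_models_by_criteria",
        "arguments": {"model_type": "image", "max_cost": 0.05},
    },
}
print(check_tool_call(valid_request))  # prints None: the call is accepted
```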


5. Server-Side Execution

Once a tool is invoked, responsibility shifts entirely to the MCP server.

The server may:

  • Read or write persistent data
  • Start or manage a workflow (e.g., via Temporal)
  • Call external APIs or generation models
  • Apply retries, timeouts, and error handling
  • Enforce business rules and permissions

From the agent’s perspective, all of this is implementation detail.
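To make the retry behaviour concrete, here is a minimal sketch of how a server might wrap an external call with exponential backoff. The helper name `call_with_retries`, the choice of `ConnectionError` as the retryable failure, and the `flaky_generation_api` stand-in are all assumptions for illustration; timeouts and permission checks are left out.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_generation_api() -> str:
    """Hypothetical external call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary outage")
    return "image_ready"


delays = []  # record the backoff delays instead of actually sleeping
outcome = call_with_retries(flaky_generation_api, base_delay=1.0, sleep=delays.append)
print(outcome, delays)  # prints: image_ready [1.0, 2.0]
```

The agent sees only the final outcome; the two failed attempts never leave the server.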


6. Returning Results

The MCP server returns a structured response that may include:

  • Results or generated artifacts
  • Identifiers (task IDs, project IDs)
  • Status or progress information
  • Metadata useful for next steps

The response is designed to be easy for the agent to reason about, not to expose internal complexity.
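On the client side, handling such a response can be as simple as the sketch below. A JSON-RPC 2.0 response carries either a `result` or an `error` member; the `unwrap` helper and `ToolCallError` class are hypothetical names introduced for this example.

```python
class ToolCallError(Exception):
    """Raised when a response does not match the request or carries an error."""


def unwrap(response: dict, expected_id: int) -> dict:
    """Return the result of a JSON-RPC response, or raise if the call failed."""
    if response.get("id") != expected_id:
        raise ToolCallError(
            f"response id {response.get('id')} does not match request id {expected_id}"
        )
    if "error" in response:
        err = response["error"]
        raise ToolCallError(f"{err.get('code')}: {err.get('message')}")
    return response["result"]


response = {
    "jsonrpc": "2.0",
    "id": 4,
    "result": {"card_id": "card_789", "status": "created"},
}
result = unwrap(response, expected_id=4)
print(result["card_id"])  # prints: card_789
```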


7. Agent Continues the Loop

With the result in hand, the agent:

  • Updates its internal plan
  • Decides whether additional tools are needed
  • Responds to the user or continues execution

This request–reason–act loop continues until the goal is complete.
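The loop can be sketched in a few lines. Everything here is a stand-in: `scripted_planner` replaces the agent's actual reasoning with a fixed script, and `fake_call_tool` returns canned results instead of talking to a server. The tool names and IDs are reused from the examples in this document.

```python
from typing import Callable, Optional


def run_agent_loop(
    plan_next: Callable[[list], Optional[dict]],
    call_tool: Callable[[str, dict], dict],
    max_steps: int = 10,
) -> list:
    """Alternate between planning and tool calls until the planner returns None."""
    history = []
    for _ in range(max_steps):
        action = plan_next(history)  # reason: decide the next step
        if action is None:           # goal reached, nothing left to do
            break
        result = call_tool(action["name"], action["arguments"])  # act
        history.append({"action": action, "result": result})
    return history


def scripted_planner(history: list) -> Optional[dict]:
    """Hypothetical planner: create a card, then start a workflow for it."""
    if len(history) == 0:
        return {
            "name": "create_card",
            "arguments": {"project_id": "proj_123", "name": "Hero Concept Art"},
        }
    if len(history) == 1:
        card_id = history[0]["result"]["card_id"]
        return {
            "name": "start_generation_workflow",
            "arguments": {"project_id": "proj_123", "card_id": card_id},
        }
    return None


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for the MCP client; returns canned results per tool."""
    canned = {
        "create_card": {"card_id": "card_789", "status": "created"},
        "start_generation_workflow": {"workflow_id": "wf_456", "status": "started"},
    }
    return canned[name]


history = run_agent_loop(scripted_planner, fake_call_tool)
print([step["action"]["name"] for step in history])
# prints: ['create_card', 'start_generation_workflow']
```

The `max_steps` cap is a safeguard added in this sketch so that a planner that never finishes cannot loop forever.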


Stateless Agents, Stateful Servers

This interaction model is intentional.

Agents

  • Are ephemeral and restartable
  • Focus on reasoning and decision-making
  • Hold short-lived conversational context

MCP Servers

  • Are persistent and reliable
  • Own long-term state and memory
  • Handle side effects and durability

This design allows agents to fail or restart without losing system integrity.
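A toy sketch of this split, under heavy simplification: `ProjectServer` keeps its data in an in-memory dictionary purely as a stand-in for real persistent storage, and both class names and their methods are hypothetical. The point is only that a fresh agent can rebuild its context from the server.

```python
class ProjectServer:
    """Stand-in for a stateful server: owns all long-lived project data."""

    def __init__(self) -> None:
        self._cards = {}  # project_id -> list of card names

    def create_card(self, project_id: str, name: str) -> None:
        self._cards.setdefault(project_id, []).append(name)

    def get_cards(self, project_id: str) -> list:
        # Return a copy so callers cannot mutate server state directly.
        return list(self._cards.get(project_id, []))


class Agent:
    """Ephemeral agent: holds only a working copy of context, rebuilt on start."""

    def __init__(self, server: ProjectServer, project_id: str) -> None:
        self.server = server
        self.project_id = project_id
        self.context = server.get_cards(project_id)

    def add_card(self, name: str) -> None:
        self.server.create_card(self.project_id, name)
        self.context.append(name)


server = ProjectServer()
first = Agent(server, "proj_123")
first.add_card("Hero Concept Art")
del first  # simulate the agent crashing or being restarted

second = Agent(server, "proj_123")
print(second.context)  # prints: ['Hero Concept Art']
```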


Interaction Patterns by Server Type

  • Models Servers provide information and comparisons; the agent makes the choice, and the server does not decide for it.
  • Project & Memory Servers act as the system’s long-term memory and structural backbone.
  • StoryCraft Servers enforce structured, field-level updates with validation and merge semantics.
  • Temporal-backed Servers ensure long-running processes survive failures and retries.

Each server exposes a narrow, well-defined surface area that agents can safely use.


Interaction Examples

The examples below are intentionally low-level and protocol-oriented. Responses are simplified to show only each tool’s payload; in the MCP specification, a tools/call result wraps tool output in a content array (and optionally structuredContent).

Example 1: Tool Discovery

Scenario

An agent connects to an MCP server and needs to understand what capabilities are available.

Interaction

Agent → MCP Server

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list"
}

MCP Server → Agent

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "find_models_by_criteria",
        "description": "Find AI models matching cost, type, and resolution constraints",
        "inputSchema": {
          "type": "object",
          "properties": {
            "model_type": { "type": "string" },
            "max_cost": { "type": "number" }
          }
        }
      }
    ]
  }
}

Outcome

The agent now has a bounded, explicit set of actions it is allowed to perform against this server.


Example 2: Model Selection via Models MCP Server

Scenario

The agent needs to select a cost-effective image generation model.

Interaction

Agent → MCP Server

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "find_models_by_criteria",
    "arguments": {
      "model_type": "image",
      "max_cost": 0.05
    }
  }
}

MCP Server → Agent

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "models": [
      {
        "id": "gpt-image-1",
        "provider": "openai",
        "cost": 0.02,
        "max_resolution": "1536x1024"
      }
    ]
  }
}

Outcome

The agent reasons over the response and decides which model to use. The server provides data but does not make the decision.


Example 3: Loading Project Context

Scenario

An agent needs full project context before generating assets.

Interaction

Agent → Project Structure MCP Server

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "get_full_project_info",
    "arguments": {
      "project_id": "proj_123"
    }
  }
}

MCP Server → Agent

{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "project_id": "proj_123",
    "memory": "Cinematic style, dramatic lighting",
    "categories": [...],
    "sequences": [...],
    "shots": [...],
    "cards": [...],
    "context_loaded": true
  }
}

Outcome

The agent now has durable, structured context without managing storage or persistence itself.


Example 4: Creating a Generation Card

Scenario

The agent wants to create a new image generation task within a project.

Interaction

Agent → MCP Server

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "create_card",
    "arguments": {
      "project_id": "proj_123",
      "category_id": "cat_characters",
      "name": "Hero Concept Art",
      "task_type": "image"
    }
  }
}

MCP Server → Agent

{
  "jsonrpc": "2.0",
  "id": 4,
  "result": {
    "card_id": "card_789",
    "status": "created"
  }
}

Outcome

The server handles persistence and validation. The agent receives a reference it can use for subsequent steps.


Example 5: Updating Structured Narrative Data (StoryCraft)

Scenario

The user asks to update a single field in a story bible.

Interaction

Agent → StoryCraft MCP Server

{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/call",
  "params": {
    "name": "update_entity_field",
    "arguments": {
      "entity_id": "story_001",
      "template_id": "sct_bible",
      "field_key": "world",
      "value": "A post-apocalyptic Earth reclaimed by nature"
    }
  }
}

MCP Server → Agent

{
  "jsonrpc": "2.0",
  "id": 5,
  "result": {
    "success": true,
    "field_key": "world",
    "message": "Field updated successfully"
  }
}

Outcome

Only the specified field is updated. All other template data remains intact.
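One way a server could implement this field-level behaviour is sketched below. The `STORE` layout, the `tone` and `era` fields, and the failure messages are hypothetical; only the argument names and the success response mirror the exchange above.

```python
# Hypothetical stored entity; fields other than "world" are illustrative.
STORE = {
    "story_001": {
        "template_id": "sct_bible",
        "fields": {"world": "Unknown", "tone": "Hopeful", "era": "Far future"},
    }
}


def update_entity_field(
    store: dict, entity_id: str, template_id: str, field_key: str, value: str
) -> dict:
    """Update exactly one field of one entity, leaving every other field untouched."""

    def failure(message: str) -> dict:
        return {"success": False, "field_key": field_key, "message": message}

    entity = store.get(entity_id)
    if entity is None:
        return failure("Unknown entity")
    if entity["template_id"] != template_id:
        return failure("Template mismatch")
    if field_key not in entity["fields"]:
        return failure("Unknown field")
    entity["fields"][field_key] = value  # touch only the requested field
    return {
        "success": True,
        "field_key": field_key,
        "message": "Field updated successfully",
    }


reply = update_entity_field(
    STORE,
    "story_001",
    "sct_bible",
    "world",
    "A post-apocalyptic Earth reclaimed by nature",
)
print(reply["success"], STORE["story_001"]["fields"]["tone"])  # prints: True Hopeful
```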


Example 6: Long-Running Workflow via Temporal-Backed MCP Server

Scenario

The agent initiates a multi-step generation workflow that may take minutes or hours.

Interaction

Agent → MCP Server

{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "tools/call",
  "params": {
    "name": "start_generation_workflow",
    "arguments": {
      "project_id": "proj_123",
      "card_id": "card_789"
    }
  }
}

MCP Server → Agent

{
  "jsonrpc": "2.0",
  "id": 6,
  "result": {
    "workflow_id": "wf_456",
    "status": "started"
  }
}

Later, the agent may query status:

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "get_workflow_status",
    "arguments": {
      "workflow_id": "wf_456"
    }
  }
}

Outcome

The workflow executes durably in Temporal. Because progress can always be looked up by workflow ID, the agent remains stateless and restart-safe.
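The status query lends itself to a simple polling loop, sketched below. The status names `running`, `completed`, and `failed` are assumptions (only `started` appears above), and `get_status` stands in for whatever function issues the `get_workflow_status` call and extracts the status string.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}  # assumed status names


def wait_for_workflow(
    get_status: Callable[[str], str],
    workflow_id: str,
    poll_interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the workflow reaches a terminal status or the poll budget runs out."""
    for attempt in range(max_polls):
        status = get_status(workflow_id)
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_interval)
    raise TimeoutError(f"workflow {workflow_id} still running after {max_polls} polls")


# Scripted stand-in for the server: three successive status answers.
scripted = iter(["started", "running", "completed"])
final = wait_for_workflow(
    lambda _workflow_id: next(scripted),
    "wf_456",
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
print(final)  # prints: completed
```

Since the loop needs nothing but the workflow ID, a restarted agent can resume polling where its predecessor left off.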