AI Models Cost Tracking

Architecture and event pipeline for tracking AI model execution costs and enforcing usage quotas.


Goals

System Goals

  • Capture all AI model execution costs (frontend and backend workers)
  • Maintain a single source of truth for analytics in ClickHouse
  • Allow Policy Vault to compute quotas using the same cost events
  • Ensure full consistency between:
    • UI analytics
    • quota enforcement

High-Level Architecture

---
config:
  layout: elk
---
flowchart TD
    A[ML Agents]
    B[BG Workers]
    C[Cost Event Producer]
    D[Kafka]
    E[Kafka Topic: ml-agents<br/>raw cost topic]
    F[ClickHouse<br/>telemetry.ml-agents table<br/>Aggregates, validates & processes cost events]
    G[Policy Vault Service<br/>Consumes same Kafka topic in real-time<br/>Updates quotas]
    H[Kafka Topic: ml-agents-broken-msg<br/>invalid messages]
    A --> C
    B --> C
    C --> D
    D --> E
    E --> F
    E --> G
    G --> H

Event Flow

Detailed Flow Explanation

1. Event Production

  • ML Agents and backend workers emit cost events
  • Events are published to the Kafka topic:

    ml-agents

2. ClickHouse Processing

ClickHouse consumes the topic and:

  • stores valid events
  • marks invalid events with

    event_type = "data.import.error"

3. Policy Vault Processing

Policy Vault consumes the same Kafka topic in parallel:

  • validates message schema
  • updates user/project/company quotas
  • pushes malformed events to

    ml-agents-broken-msg

4. UI

The UI reads:

  • analytics → ClickHouse
  • quota states → Policy Vault

Event Sources

Where cost events originate

Cost events are emitted when:

  • User triggers AI generation
  • Token usage is returned by APIs
  • Temporal workflows complete
  • Batch processing finishes
  • Retries or background inference jobs finish

All events are published to:

ml-agents

Cost Event Schema

All producers must emit normalized, idempotent events.


Event JSON Example

{
  "key": "uuid",
  "timestamp": "ISO 8601 datetime",
  "tenantId": "string",
  "userId": "uuid",
  "projectId": "string",
  "modelId": "string",
  "operation": "string",
  "currency": "USD",
  "amount": 0.0025,
  "tokensIn": 1200,
  "tokensOut": 450,
  "version": 1
}
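As an illustration, a producer could assemble an event of this shape as follows. This is a minimal sketch, not the actual producer code: the function names `build_cost_event` and `serialize` are hypothetical, and the step that hands the bytes to a Kafka client is deliberately left out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_cost_event(user_id, project_id, model_id, operation,
                     amount, tokens_in, tokens_out, tenant_id="xavier"):
    """Return a cost event dict with the fields shown in the JSON example."""
    return {
        "key": str(uuid.uuid4()),  # random unique event ID, used for idempotency
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
        "tenantId": tenant_id,
        "userId": user_id,
        "projectId": project_id,
        "modelId": model_id,
        "operation": operation,
        "currency": "USD",
        "amount": amount,
        "tokensIn": tokens_in,
        "tokensOut": tokens_out,
        "version": 1,
    }


def serialize(event):
    """Encode the event as UTF-8 JSON bytes (hypothetical helper)."""
    return json.dumps(event).encode("utf-8")
```

The resulting bytes would then be published to the ml-agents topic by whichever Kafka client the producer uses.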

Schema Fields

| Field     | Type      | Description                                            |
|-----------|-----------|--------------------------------------------------------|
| key       | uuid      | Randomly generated unique event ID used for idempotency |
| timestamp | timestamp | Event time (ISO 8601, UTC)                             |
| tenantId  | string    | Tenant identifier (default: xavier)                    |
| userId    | uuid      | Current user identifier                                |
| projectId | string    | Project identifier (Pocketbase, e.g. 6dcs9qlfuamw5hq)  |
| modelId   | string    | Model identifier (Pocketbase, e.g. 0w5bnpvx3nnb730)    |
| operation | string    | Operation name                                         |
| currency  | string    | 3-letter currency code (USD)                           |
| amount    | number    | Final calculated cost                                  |
| tokensIn  | number    | Input token count                                      |
| tokensOut | number    | Output token count                                     |
| version   | number    | Schema version (default: 1)                            |
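The field list above can be turned into a schema check. The sketch below is one possible validator, not the one used by Policy Vault or ClickHouse: the name `validate_cost_event` is hypothetical, and the rule that amounts and token counts must be non-negative is an assumption that the schema does not state.

```python
import uuid
from datetime import datetime

# Expected Python type per field, following the schema table.
REQUIRED_FIELDS = {
    "key": str, "timestamp": str, "tenantId": str, "userId": str,
    "projectId": str, "modelId": str, "operation": str, "currency": str,
    "amount": (int, float), "tokensIn": int, "tokensOut": int, "version": int,
}


def validate_cost_event(event):
    """Return a list of problems; an empty list means the event looks valid."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(event[name], bool) or not isinstance(event[name], expected):
            errors.append(f"wrong type for field: {name}")
    if errors:
        return errors
    for name in ("key", "userId"):
        try:
            uuid.UUID(event[name])
        except ValueError:
            errors.append(f"{name} is not a UUID")
    try:
        # Older Python versions do not accept a trailing "Z", so normalize it.
        datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    except ValueError:
        errors.append("timestamp is not ISO 8601")
    if len(event["currency"]) != 3:
        errors.append("currency must be a 3-letter code")
    # Assumption: negative costs or token counts are never valid.
    if event["amount"] < 0 or event["tokensIn"] < 0 or event["tokensOut"] < 0:
        errors.append("amount and token counts must be non-negative")
    return errors
```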

ClickHouse – Source of Truth

Table

telemetry.ml-agents

Purpose:

  • Analytics
  • Reporting
  • UI cost dashboards
  • Data consistency anchor

Invalid message handling:

  • Invalid events are stored in the same table with event_type = "data.import.error"
  • ClickHouse does not forward them to a separate Kafka topic

Why ClickHouse is the source of truth

  • Append-only immutable events
  • Supports aggregations by:

    • User
    • Project
    • Model
    • Card/Shot (if necessary)
    • Period (daily, monthly, etc.)
  • High-performance analytical queries
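To make the period-based aggregation concrete, the sketch below computes the same kind of grouping in plain Python. It only illustrates the logic: in the real system this aggregation runs inside ClickHouse, and the function name `cost_by_period` and the choice to group by model are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal


def cost_by_period(events, period="daily"):
    """Sum event amounts per (period bucket, modelId)."""
    fmt = {"daily": "%Y-%m-%d", "monthly": "%Y-%m"}[period]
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for event in events:
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        bucket = ts.strftime(fmt)
        # Convert via str() so 0.0025 becomes Decimal("0.0025") exactly.
        totals[(bucket, event["modelId"])] += Decimal(str(event["amount"]))
    return dict(totals)
```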

This table is already used by the UI to display:

  • Current expenditure
  • Period-based cost
  • Model usage breakdown

Policy Vault – Quota Processing

Policy Vault subscribes to the ml-agents Kafka topic simultaneously with ClickHouse.


Processing Steps

For each event:

  1. Validate schema
  2. Aggregate spending by:

    • user
    • project
    • model
  3. Update internal quota state
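The aggregation step can be sketched as a small in-memory accumulator. This is an illustrative assumption about how running totals might be kept, not Policy Vault's actual storage model; the class name `SpendingTotals` is hypothetical, and a real service would persist this state.

```python
from collections import defaultdict
from decimal import Decimal


class SpendingTotals:
    """Running spend per user, project, and model (in-memory sketch)."""

    def __init__(self):
        self.by_user = defaultdict(Decimal)
        self.by_project = defaultdict(Decimal)
        self.by_model = defaultdict(Decimal)

    def apply(self, event):
        """Add one validated cost event to all three totals."""
        # Convert via str() so 0.0025 becomes Decimal("0.0025") exactly.
        amount = Decimal(str(event["amount"]))
        self.by_user[event["userId"]] += amount
        self.by_project[event["projectId"]] += amount
        self.by_model[event["modelId"]] += amount
```

Decimal is used instead of float so that many small amounts add up without binary rounding drift.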

Broken Message Handling

Invalid events are forwarded to:

ml-agents-broken-msg
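A consumer loop could route messages along these lines. The sketch assumes two injected callables that are not part of the original design: `publish(topic, payload)`, standing in for a Kafka producer, and `on_valid(event)`, standing in for the quota update. The check shown covers only JSON parsing and field presence.

```python
import json

BROKEN_TOPIC = "ml-agents-broken-msg"
REQUIRED_FIELDS = {
    "key", "timestamp", "tenantId", "userId", "projectId", "modelId",
    "operation", "currency", "amount", "tokensIn", "tokensOut", "version",
}


def handle_message(raw, publish, on_valid):
    """Process one raw Kafka message; return True if it was valid."""
    try:
        event = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        publish(BROKEN_TOPIC, raw)
        return False
    if not isinstance(event, dict) or not REQUIRED_FIELDS.issubset(event):
        publish(BROKEN_TOPIC, raw)
        return False
    on_valid(event)
    return True
```

Forwarding the raw payload unchanged keeps the broken message available for later inspection.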

Quota Calculation

remaining_quota = assigned_quota - total_spent

Policy Vault exposes the following through its API:

  • remaining budget
  • percentage used
  • threshold alerts
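The formula and the three exposed values can be worked through in a few lines. This is a sketch under stated assumptions: the function name `quota_status`, the 80% and 100% alert thresholds, and treating a zero quota as fully used are all hypothetical choices, not documented behavior.

```python
from decimal import Decimal

# Hypothetical alert thresholds, expressed as fractions of the assigned quota.
DEFAULT_THRESHOLDS = (Decimal("0.8"), Decimal("1.0"))


def quota_status(assigned_quota, total_spent, thresholds=DEFAULT_THRESHOLDS):
    """Compute remaining budget, percentage used, and crossed thresholds."""
    remaining = assigned_quota - total_spent  # remaining_quota formula
    if assigned_quota > 0:
        used_fraction = total_spent / assigned_quota
    else:
        used_fraction = Decimal("1")  # assumption: a zero quota counts as fully used
    return {
        "remaining": remaining,
        "percent_used": used_fraction * 100,
        "alerts": [t for t in thresholds if used_fraction >= t],
    }
```

For example, with an assigned quota of 100 and 85 spent, the remaining budget is 15, 85% is used, and only the 0.8 threshold is crossed.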

Data Consistency Strategy

Design Principle

ClickHouse = financial ledger

Policy Vault = real-time quota calculator


Idempotency

Exactly-once semantics are achieved through:

  • Unique event UUID
  • Deduplication in Policy Vault
  • Kafka message key = workflow run ID
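Deduplication by event UUID might look like the sketch below. It keeps the set of seen keys in memory purely for illustration; the class name `IdempotentSpendTracker` is hypothetical, and a real deployment would need a persistent or bounded store for seen keys.

```python
from decimal import Decimal


class IdempotentSpendTracker:
    """Applies each event at most once, keyed by the event's "key" field."""

    def __init__(self):
        self._seen_keys = set()
        self.total_spent = Decimal("0")

    def apply(self, event):
        """Apply the event once; return False if its key was already processed."""
        if event["key"] in self._seen_keys:
            return False  # duplicate delivery, e.g. after a retry or a replay
        self._seen_keys.add(event["key"])
        self.total_spent += Decimal(str(event["amount"]))
        return True
```

With this guard, a redelivered message leaves the total unchanged.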

Parallel Consumption

Policy Vault consumes in parallel with ClickHouse, avoiding:

  • dependency on ClickHouse availability
  • latency in quota updates

Replay Capability

If quota state needs to be rebuilt:

  1. Reset Kafka consumer offset
  2. Replay ml-agents
  3. Rebuild quotas deterministically from immutable events
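The rebuild in step 3 is deterministic because it is a pure function of the event log. The sketch below shows this for per-user quotas; the function name `rebuild_remaining_quotas` and the `assigned_quotas` mapping are assumptions made for illustration, and the Kafka offset reset itself is not shown.

```python
from decimal import Decimal


def rebuild_remaining_quotas(assigned_quotas, events):
    """Recompute remaining quota per user from a replayed event stream."""
    seen_keys = set()
    spent = {user: Decimal("0") for user in assigned_quotas}
    for event in events:
        if event["key"] in seen_keys:
            continue  # duplicates in the log must not be counted twice
        seen_keys.add(event["key"])
        user = event["userId"]
        if user in spent:  # events for users without a quota are ignored here
            spent[user] += Decimal(str(event["amount"]))
    return {user: assigned_quotas[user] - spent[user] for user in assigned_quotas}
```

Replaying the same events against the same assigned quotas always yields the same result, regardless of how many times the replay runs.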

UI Data Flow

Models Cost Dashboard

The dashboard reads data from ClickHouse through the Telemetry service.

Gateway API endpoint:

/stats

The UI displays:

  • cost by period
  • cost by project
  • cost by model
  • cost by user

Architecture Benefits

  • Single financial source of truth
  • Event-driven architecture
  • Highly scalable
  • Supports replay and deterministic rebuild
  • Clear separation of responsibilities

    | Component    | Responsibility    |
    |--------------|-------------------|
    | ClickHouse   | Analytics         |
    | Policy Vault | Quota enforcement |
    | Kafka        | Event backbone    |
  • Broken messages are isolated safely.


Summary

  • All AI execution costs are emitted by ML Agents and backend workers
  • Kafka acts as the central event transport
  • ClickHouse stores the immutable financial ledger and marks invalid events with event_type = "data.import.error"
  • Policy Vault consumes the same topic in parallel, updates quotas, and forwards only broken messages to ml-agents-broken-msg
  • The UI shows expenditure from ClickHouse and remaining quota from Policy Vault

Result

Consistent financial analytics and quota enforcement across the AI platform.