AI Models Cost Tracking¶
Architecture and event pipeline for tracking AI model execution costs and enforcing usage quotas.
Goals¶
System Goals
- Capture all AI model execution costs (frontend and backend workers)
- Maintain a single source of truth for analytics in ClickHouse
- Allow Policy Vault to compute quotas using the same cost events
- Ensure full consistency between:
  - UI analytics
  - quota enforcement
High-Level Architecture¶
Event Flow¶
Detailed Flow Explanation
1. Event Production
- ML Agents and backend workers emit cost events
- Events are published to the Kafka topic: ml-agents
2. ClickHouse Processing
ClickHouse consumes the topic and:
- stores valid events
- marks invalid events with event_type = "data.import.error"
3. Policy Vault Processing
Policy Vault consumes the same Kafka topic in parallel:
- validates message schema
- updates user/project/company quotas
- pushes malformed events to ml-agents-broken-msg
4. UI
The UI reads:
- analytics → ClickHouse
- quota states → Policy Vault
Event Sources¶
Where cost events originate
Cost events are emitted when:
- User triggers AI generation
- Token usage is returned by APIs
- Temporal workflows complete
- Batch processing finishes
- Retries or background inference jobs finish
All events are published to the ml-agents Kafka topic.
Cost Event Schema¶
All producers must emit normalized, idempotent events.
Event JSON Example¶
{
  "key": "uuid",
  "timestamp": "ISO 8601 datetime",
  "tenantId": "string",
  "userId": "uuid",
  "projectId": "string",
  "modelId": "string",
  "operation": "string",
  "currency": "USD",
  "amount": 0.0025,
  "tokensIn": 1200,
  "tokensOut": 450,
  "version": 1
}
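As an illustration of how a producer might assemble such an event, here is a minimal Python sketch. The helper name `build_cost_event`, the sample user ID, and the operation name `image.generate` are assumptions made for this example and are not part of the platform. The sketch only builds and serializes the payload; publishing it to Kafka is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_cost_event(user_id, project_id, model_id, operation,
                     amount, tokens_in, tokens_out, tenant_id="xavier"):
    """Build a cost event dict shaped like the JSON example (hypothetical helper)."""
    return {
        "key": str(uuid.uuid4()),  # random unique event ID, used for idempotency
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
        "tenantId": tenant_id,
        "userId": user_id,
        "projectId": project_id,
        "modelId": model_id,
        "operation": operation,
        "currency": "USD",
        "amount": amount,
        "tokensIn": tokens_in,
        "tokensOut": tokens_out,
        "version": 1,
    }


event = build_cost_event(
    user_id="3f2b8c1e-5d4a-4e6f-9a7b-1c2d3e4f5a6b",  # made-up sample UUID
    project_id="6dcs9qlfuamw5hq",
    model_id="0w5bnpvx3nnb730",
    operation="image.generate",  # hypothetical operation name
    amount=0.0025,
    tokens_in=1200,
    tokens_out=450,
)
payload = json.dumps(event).encode("utf-8")  # bytes that a Kafka producer could send
```

Generating the key once, at the point where the cost is known, matters for idempotency: a retry of the same publish should reuse the same key rather than mint a new one.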
Schema Fields¶
| Field | Type | Description |
|---|---|---|
| key | uuid | Randomly generated unique event ID used for idempotency |
| timestamp | timestamp | Event time (ISO-8601 UTC) |
| tenantId | string | Tenant identifier (default xavier) |
| userId | uuid | Current user identifier |
| projectId | string | Project identifier (Pocketbase: e.g. 6dcs9qlfuamw5hq) |
| modelId | string | Model identifier (Pocketbase: e.g. 0w5bnpvx3nnb730) |
| operation | string | Operation name |
| currency | string | 3-letter currency code (USD) |
| amount | number | Final calculated cost |
| tokensIn | number | Input token count |
| tokensOut | number | Output token count |
| version | number | Schema version (default 1) |
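The field table above can be turned into a schema check. The following Python sketch is one possible validator, not the one used by ClickHouse or Policy Vault. The function name `validate_cost_event` and the rule that amounts and token counts must be non-negative are assumptions added for illustration.

```python
import uuid
from datetime import datetime

# Field name -> accepted Python types, following the schema table.
REQUIRED_FIELDS = {
    "key": str, "timestamp": str, "tenantId": str, "userId": str,
    "projectId": str, "modelId": str, "operation": str, "currency": str,
    "amount": (int, float), "tokensIn": int, "tokensOut": int, "version": int,
}


def validate_cost_event(event):
    """Return a list of problems; an empty list means the event looks valid."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        elif isinstance(event[name], bool) or not isinstance(event[name], expected):
            # bool is a subclass of int in Python, so reject it explicitly
            errors.append(f"wrong type for {name}")
    if errors:
        return errors  # skip value checks when the structure is already wrong
    for name in ("key", "userId"):
        try:
            uuid.UUID(event[name])
        except ValueError:
            errors.append(f"{name} is not a UUID")
    try:
        # Older Python versions do not accept a trailing "Z", so normalize it
        datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    except ValueError:
        errors.append("timestamp is not ISO 8601")
    if len(event["currency"]) != 3 or not event["currency"].isalpha():
        errors.append("currency is not a 3-letter code")
    if event["amount"] < 0 or event["tokensIn"] < 0 or event["tokensOut"] < 0:
        # Assumption: negative values are treated as invalid in this sketch
        errors.append("amount and token counts must be non-negative")
    return errors
```

Returning a list of problems rather than raising on the first one makes it easier to record why an event was rejected.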
ClickHouse – Source of Truth¶
Table
Purpose:
- Analytics
- Reporting
- UI cost dashboards
- Data consistency anchor
Invalid message handling:
- Stored in the table with event_type = "data.import.error"
- Not pushed to a separate Kafka topic
Why ClickHouse is the source of truth¶
- Append-only immutable events
- Supports aggregations by:
  - User
  - Project
  - Model
  - Card/Shot (if necessary)
  - Period (daily, monthly, etc.)
- High-performance analytical queries
The UI already uses this table to display:
- Current expenditure
- Period-based cost
- Model usage breakdown
Publishing to ai.execution.cost¶
Policy Vault subscribes to the ml-agents Kafka topic simultaneously with ClickHouse.
Processing Steps¶
For each event:
- Validate schema
- Aggregate spending by:
  - user
  - project
  - model
- Update internal quota state
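The aggregation step can be sketched as a small in-memory accumulator. This is an illustrative Python example only: the class name `SpendingAggregator` is hypothetical, and the actual Policy Vault storage and data model are not described in this document.

```python
from collections import defaultdict
from decimal import Decimal


class SpendingAggregator:
    """Running cost totals per user, project, and model (in-memory sketch)."""

    def __init__(self):
        # Decimal() is zero, so unseen IDs start from a zero total
        self.by_user = defaultdict(Decimal)
        self.by_project = defaultdict(Decimal)
        self.by_model = defaultdict(Decimal)

    def apply(self, event):
        """Add one validated cost event to all three totals."""
        # Convert through str so 0.0025 becomes exactly Decimal("0.0025")
        amount = Decimal(str(event["amount"]))
        self.by_user[event["userId"]] += amount
        self.by_project[event["projectId"]] += amount
        self.by_model[event["modelId"]] += amount


aggregator = SpendingAggregator()
aggregator.apply({"userId": "u1", "projectId": "p1", "modelId": "m1", "amount": 0.0025})
aggregator.apply({"userId": "u1", "projectId": "p2", "modelId": "m1", "amount": 0.005})
```

Using `Decimal` instead of binary floats is a design choice of this sketch: it keeps sums of small amounts exact, which helps the quota totals agree with figures computed elsewhere.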
Broken Message Handling¶
Invalid events are forwarded to ml-agents-broken-msg.
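A minimal Python sketch of this routing decision follows. The function name `route` and the short list of required fields are assumptions for illustration; a real consumer would apply the full schema check and publish the raw message with a Kafka client.

```python
import json

BROKEN_TOPIC = "ml-agents-broken-msg"
# Assumption: a reduced field list, just enough to illustrate the routing
REQUIRED = ("key", "userId", "projectId", "modelId", "amount")


def route(raw):
    """Return (None, event) for a usable message, or (BROKEN_TOPIC, raw) otherwise."""
    try:
        event = json.loads(raw)
    except ValueError:
        # Covers malformed JSON and undecodable bytes (both are ValueError subclasses)
        return BROKEN_TOPIC, raw
    if not isinstance(event, dict) or any(name not in event for name in REQUIRED):
        return BROKEN_TOPIC, raw
    return None, event
```

Forwarding the original bytes, rather than a partially parsed object, keeps the broken message intact for later inspection.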
Quota Calculation¶
The service exposes the following through its API:
- remaining budget
- percentage used
- threshold alerts
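How these three values relate can be shown with a short Python sketch. The names `QuotaStatus` and `quota_status`, and the threshold levels of 50, 80, and 100 percent, are assumptions for this example; the actual API shape and alert levels are not specified here.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class QuotaStatus:
    remaining: Decimal      # budget minus spending (negative when over budget)
    percent_used: Decimal   # spending as a percentage of the budget
    alerts: tuple           # thresholds (in percent) that have been reached


def quota_status(budget, spent, thresholds=(50, 80, 100)):
    """Derive quota figures from a budget and the spending accumulated so far."""
    budget, spent = Decimal(budget), Decimal(spent)
    if budget <= 0:
        raise ValueError("budget must be positive")
    percent = spent / budget * 100
    reached = tuple(t for t in sorted(thresholds) if percent >= t)
    return QuotaStatus(remaining=budget - spent, percent_used=percent, alerts=reached)


status = quota_status(budget="100", spent="85")
```

In this example `status.remaining` is 15, `status.percent_used` is 85, and the 50 and 80 percent thresholds are reported as reached.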
Data Consistency Strategy¶
Design Principle
ClickHouse = financial ledger
Policy Vault = real-time quota calculator
Idempotency¶
Exactly-once semantics
- Unique event UUID
- Deduplication in Policy Vault
- Kafka message key = workflow run ID
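Deduplication by event UUID can be sketched as follows. This Python example keeps seen keys in an in-memory set, which is an assumption made for brevity; a real service would need a persistent and bounded store. The class name `DedupingConsumer` is hypothetical.

```python
class DedupingConsumer:
    """Apply each event at most once, identified by its "key" field."""

    def __init__(self, handler):
        self._seen = set()       # keys of events that were already applied
        self._handler = handler  # callable that updates quota state

    def process(self, event):
        """Return True if the event was applied, False if it was a duplicate."""
        key = event["key"]
        if key in self._seen:
            return False
        self._handler(event)
        # Record the key only after the handler succeeds, so a failed
        # attempt can be retried on redelivery
        self._seen.add(key)
        return True


applied = []
consumer = DedupingConsumer(applied.append)
first = consumer.process({"key": "a", "amount": 0.0025})
second = consumer.process({"key": "a", "amount": 0.0025})  # redelivery of the same event
```

With this approach a redelivered message leaves the quota state unchanged, so each event takes effect once even if the broker delivers it more than once.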
Parallel Consumption¶
Policy Vault consumes in parallel with ClickHouse, avoiding:
- dependency on ClickHouse availability
- latency in quota updates
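The independence of the two consumers can be modeled with a toy in-memory log in which every consumer tracks its own read position. This Python sketch is a simplified stand-in for Kafka, not a Kafka client; the class `TopicLog` and the consumer names `clickhouse` and `policy-vault` are assumptions for illustration.

```python
class TopicLog:
    """In-memory stand-in for a topic: an append-only list plus per-consumer offsets."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # consumer name -> index of the next unread message

    def publish(self, message):
        self._messages.append(message)

    def poll(self, consumer):
        """Return messages this consumer has not read yet and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._messages[start:]
        self._offsets[consumer] = len(self._messages)
        return batch


log = TopicLog()
log.publish("event-1")
log.publish("event-2")
clickhouse_batch = log.poll("clickhouse")      # reads both events
log.publish("event-3")
policy_vault_batch = log.poll("policy-vault")  # unaffected by the other consumer
```

Because each consumer has its own offset, a slow or unavailable consumer does not delay the other one.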
Replay Capability¶
If needed:
- Reset Kafka consumer offset
- Replay ml-agents
- Rebuild quotas deterministically from immutable events
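The rebuild can be pictured as a pure function over the event log. The following Python sketch folds events into per-user totals and skips duplicate keys; the function name `rebuild_totals` and the reduced event shape are assumptions for this example.

```python
from decimal import Decimal


def rebuild_totals(events):
    """Fold an event log into per-user totals, counting each event key once."""
    seen = set()
    totals = {}
    for event in events:
        if event["key"] in seen:
            continue  # duplicate delivery: already counted
        seen.add(event["key"])
        amount = Decimal(str(event["amount"]))  # exact decimal arithmetic
        totals[event["userId"]] = totals.get(event["userId"], Decimal("0")) + amount
    return totals


replayed_log = [
    {"key": "a", "userId": "u1", "amount": 0.0025},
    {"key": "b", "userId": "u2", "amount": 0.01},
    {"key": "a", "userId": "u1", "amount": 0.0025},  # duplicate of the first event
    {"key": "c", "userId": "u1", "amount": 0.005},
]
totals = rebuild_totals(replayed_log)
```

Since the result depends only on the set of distinct events, replaying the same log again, or in a different order, yields the same totals in this sketch.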
UI Data Flow¶
Models Cost Dashboard¶
The dashboard reads data from ClickHouse through the Telemetry service.
Gateway API endpoint:
The UI displays:
- cost by period
- cost by project
- cost by model
- cost by user
Architecture Benefits¶
- Single financial source of truth
- Event-driven architecture
- Highly scalable
- Supports replay and deterministic rebuild
- Clear separation of responsibilities:
| Component | Responsibility |
|---|---|
| ClickHouse | Analytics |
| Policy Vault | Quota enforcement |
| Kafka | Event backbone |
- Broken messages are isolated safely.
Summary¶
- All AI execution costs are emitted from ML Agents and backend workers
- Kafka acts as the central event transport
- ClickHouse stores the immutable financial ledger and marks invalid events with event_type = "data.import.error"
- Policy Vault consumes simultaneously, updates quotas, and pushes only broken messages to ml-agents-broken-msg
- UI reflects expenditure from ClickHouse and remaining quota from Policy Vault
Result
Consistent financial analytics and quota enforcement across the AI platform.