Backend Workflow Deduced from Aspire AppHost
The main flow can be deduced from the Aspire project used for app deployment.
The AppHost defines a complete local (and potentially dev) deployment of the Xavier backend: it starts the infrastructure layer (datastores, brokers, workflow engine), then starts the service layer (APIs and background workers) with explicit dependencies wired in.
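To make the "infrastructure first, then services" ordering concrete, here is a minimal Python sketch that derives a valid startup order from a dependency map. It is not the AppHost code itself. The resource names and edges are assumptions read off the service descriptions in this document, and the real Aspire project may declare different ones.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: resource -> resources it waits for.
# Edges are inferred from this document, not copied from the AppHost.
DEPENDS_ON = {
    "postgres": set(),
    "redis": set(),
    "kafka": set(),
    "nats": set(),
    "cassandra": set(),
    "clickhouse": set(),
    "temporal": set(),
    "model-registry": {"postgres", "kafka", "temporal"},
    "telemetry": {"kafka", "clickhouse", "postgres"},
    "quotas": {"clickhouse", "postgres", "redis", "nats"},
    "gateway": {"cassandra", "kafka", "redis", "model-registry"},
}


def startup_order(depends_on):
    """Return resources in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


order = startup_order(DEPENDS_ON)
```

Any order produced this way starts each datastore or broker before the services that reference it, which is the guarantee the explicit dependency wiring is meant to provide.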
Infrastructure Layer
The infrastructure layer provides storage, messaging, caching, analytics, and workflow orchestration.
PostgreSQL is used as the main transactional database engine. Multiple logical databases are created, each dedicated to a specific domain such as tariffs, policy-based access control (PBAC), assets, model registry, and email storage. This separation ensures that services do not share schemas and can evolve independently.
Redis is deployed as a fast in-memory cache. It supports low-latency operations and temporary state, helping services respond quickly without always querying primary databases.
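One common way to use such a cache is the cache-aside pattern: read from the cache, fall back to the primary database on a miss, then populate the cache with an expiry. The sketch below illustrates that pattern only. `TtlCache` is an in-memory stand-in for Redis, and `get_tariff`, the key format, and the 60-second TTL are hypothetical choices, not taken from the Xavier services.

```python
import time


class TtlCache:
    """In-memory stand-in for Redis with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (self._clock() + ttl_seconds, value)


def get_tariff(tariff_id, cache, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"tariff:{tariff_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(tariff_id)
    cache.set(key, value, ttl_seconds)
    return value
```

The TTL bounds how stale a cached value can become, which is the usual trade-off when the cache holds data owned by another store.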
Kafka and NATS provide messaging capabilities. Kafka is used for high-throughput event streaming and reliable event delivery. It supports event-driven workflows across services. NATS is used for lightweight and fast publish–subscribe messaging, typically for simpler communication scenarios.
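The difference between the two messaging styles can be sketched with in-memory stand-ins. `EventLog` models a retained log where each consumer tracks its own offset, as Kafka does. `PubSub` models plain publish-subscribe where a message reaches only the subscribers present at publish time, as core NATS does without JetStream. Both classes are illustrative simplifications, and whether the backend uses NATS persistence is not stated in the source.

```python
class EventLog:
    """Kafka-like: events are retained and each consumer tracks its own offset."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> next index to read

    def append(self, event):
        self._events.append(event)

    def poll(self, consumer):
        """Return events this consumer has not seen yet and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return batch


class PubSub:
    """Core-NATS-like: delivery only to subscribers registered at publish time."""

    def __init__(self):
        self._subscribers = {}  # subject -> list of callbacks

    def subscribe(self, subject, callback):
        self._subscribers.setdefault(subject, []).append(callback)

    def publish(self, subject, message):
        for callback in self._subscribers.get(subject, []):
            callback(message)
```

A consumer that joins the log late can still read earlier events, while a late subscriber on the pub-sub bus never sees messages published before it subscribed. That is the practical reason to put events that must not be lost on the log.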
Cassandra serves as a distributed data store for workloads that require horizontal scaling.
ClickHouse is deployed as an analytical database. It is optimized for processing large volumes of telemetry and usage data, enabling fast aggregations and reporting.
Temporal is the workflow orchestration engine. It manages long-running, multi-step processes and ensures they complete reliably, even if services restart or fail during execution.
Together, these components form the technical backbone of the system.
Application Services
On top of the infrastructure, several business services are deployed.
Gateway API
The Gateway API is the main entry point for external clients. It routes requests to internal services, interacts with Cassandra for distributed data access, communicates with the Model Registry, and publishes or consumes events through Kafka. Redis may be used to improve performance through caching.
Quotas Service
The Quotas Web API evaluates system usage and enforces limits. It reads telemetry data from ClickHouse, uses tariff definitions stored in PostgreSQL, and may rely on Redis and NATS for fast interactions. Its purpose is to ensure that clients operate within allowed usage boundaries.
Telemetry Service
The Telemetry service consumes events from Kafka and stores processed data in ClickHouse. It may enrich this data using tariff information from PostgreSQL. This service forms the bridge between raw system events and analytical insights.
PBAC (Policy-Based Access Control)
The PBAC service manages access policies and authorization decisions. It has its own PostgreSQL database and integrates with the messaging layer. An administrative interface allows policies to be created and maintained separately from runtime decision logic.
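The source does not describe how policies are structured, so the following is only a sketch of what a runtime decision function could look like when it is kept separate from policy administration. The `Policy` fields, the prefix matching, and the deny-overrides, default-deny rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    # Hypothetical policy shape; the real PBAC schema is not given in the source.
    role: str
    action: str
    resource_prefix: str
    effect: str  # "allow" or "deny"


def is_allowed(policies, role, action, resource):
    """Default-deny evaluation in which any matching deny overrides allows."""
    matched = [
        p for p in policies
        if p.role == role
        and p.action == action
        and resource.startswith(p.resource_prefix)
    ]
    if any(p.effect == "deny" for p in matched):
        return False
    return any(p.effect == "allow" for p in matched)
```

Because the function takes the policy list as input, an administrative interface can create and edit policies without touching the decision logic.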
Assets Service
The Assets service manages asset metadata stored in PostgreSQL. It uses Temporal to coordinate background processes, and a dedicated worker executes asynchronous tasks related to asset processing.
Model Registry
The Model Registry subsystem includes a main service, a webhook component, and a worker. It uses PostgreSQL for metadata, Kafka for event communication, and Temporal for orchestrating model lifecycle workflows such as registration and updates.
Supporting Services
Additional services include a Workflows worker that executes Temporal-driven processes, an Email Sender service with its own database, development tools, and a bot that seeds telemetry-related data for testing or initialization.
Core Workflows
The system operates through a combination of synchronous request handling, event-driven communication, and durable workflow orchestration.
Client Request Workflow
- A client sends a request to the Gateway API.
- The Gateway may validate access through the PBAC service.
- The request is routed to the appropriate internal service.
- Data is read from or written to PostgreSQL, Cassandra, or Redis as needed.
- If the action generates domain events, these are published to Kafka.
This workflow handles standard API interactions and forms the main runtime path for external communication.
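The steps above can be sketched as a single handler that takes its collaborators as parameters. The request fields, the status codes, and the `authorize`, `routes`, and `publish` parameters are hypothetical names chosen for this example. The actual Gateway API is not shown in the source.

```python
def handle_request(request, authorize, routes, publish):
    """Walk the request path: authorize, route, then publish any domain events."""
    # Access check, standing in for a call to the PBAC service.
    if not authorize(request["user"], request["action"]):
        return {"status": 403}

    # Route to the internal service that owns this request.
    handler = routes.get(request["service"])
    if handler is None:
        return {"status": 404}

    # The handler does its own reads and writes and reports resulting events.
    body, events = handler(request)

    # Publish domain events, standing in for a Kafka producer.
    for event in events:
        publish(event)
    return {"status": 200, "body": body}
```

Passing the collaborators in makes the synchronous path easy to exercise with fakes, without running PBAC, the datastores, or Kafka.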
Telemetry and Quota Evaluation Workflow
- Services emit telemetry or usage events to Kafka.
- The Telemetry service consumes these events and stores them in ClickHouse.
- The Quotas service reads aggregated telemetry data from ClickHouse.
- Tariff definitions from PostgreSQL are applied to evaluate limits.
- If usage exceeds thresholds, enforcement logic is triggered.
This workflow enables near real-time monitoring and control of system usage.
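As an illustration of the evaluation step, the sketch below aggregates usage per client and compares it with a tariff limit. The event fields (`client`, `units`), the tariff names, and the strict "greater than" threshold are assumptions. In the real system the aggregation would be a ClickHouse query and the limits would come from the tariffs database.

```python
from collections import defaultdict


def aggregate_usage(events):
    """Sum usage units per client, as an analytical query over telemetry might."""
    totals = defaultdict(int)
    for event in events:
        totals[event["client"]] += event["units"]
    return dict(totals)


def over_quota(usage, client_tariff, tariff_limits):
    """Return clients whose usage exceeds their tariff limit, sorted by name."""
    return sorted(
        client
        for client, used in usage.items()
        if used > tariff_limits[client_tariff[client]]
    )


# Hypothetical sample data for illustration only.
events = [
    {"client": "acme", "units": 70},
    {"client": "acme", "units": 40},
    {"client": "globex", "units": 10},
]
usage = aggregate_usage(events)
flagged = over_quota(
    usage,
    client_tariff={"acme": "basic", "globex": "basic"},
    tariff_limits={"basic": 100},
)
```

Here `flagged` contains only `"acme"`, whose 110 units exceed the hypothetical limit of 100. Enforcement logic would then act on that list.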
Model Registration Workflow
- A new model is registered through the Gateway or webhook.
- The Model Registry service stores metadata in PostgreSQL.
- An event is published to Kafka to signal that a new model is available.
- A Temporal workflow is started to manage the registration lifecycle.
- The workflow may perform validation, trigger background tasks, update analytical records, and publish additional events.
Temporal ensures that each step completes reliably. If a failure occurs, the workflow can retry or resume from the last consistent state.
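The resume behavior can be illustrated with a deliberately simplified sketch. This is not the Temporal SDK, and Temporal itself recovers state by replaying a persisted event history rather than by the checkpoint list used here. The `completed` list, the step names, and `run_workflow` are hypothetical.

```python
def run_workflow(steps, completed, state):
    """Run steps in order, skipping any step already recorded in `completed`.

    `completed` stands in for durable history: it survives a failed run, so
    a later run continues after the last step that finished.
    """
    for name, step in steps:
        if name in completed:
            continue  # finished in an earlier run, do not repeat
        step(state)  # may raise; steps recorded so far stay recorded
        completed.append(name)
    return state
```

If a step raises, the caller can invoke `run_workflow` again with the same `completed` list, and only the failed step and the steps after it are executed.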
Asset Processing Workflow
- An asset is uploaded or updated.
- The Assets service stores metadata in PostgreSQL.
- A Temporal workflow is initiated.
- A worker executes background processing steps, which may involve multiple retries.
- Completion events are published if required.
This design ensures reliable and traceable asset handling.
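The retry behavior of a background processing step can be sketched as a bounded retry loop with exponential backoff. In practice a Temporal retry policy would usually handle this. The helper name, the attempt limit, and the delays below are assumptions for illustration, not values from the Assets worker.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `task` until it succeeds or attempts run out, doubling the delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            # Delay grows as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Making `sleep` injectable lets tests record the delays instead of actually waiting.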