
Flexprice for Billing

Overview

Flexprice is a modern billing platform designed for products where usage is the primary driver of cost. Instead of relying on static subscriptions or fixed pricing tiers, it allows you to charge customers based on what they actually consume.

In the context of AI systems, this becomes especially relevant. Different models use different billing units—tokens, images, seconds of video—and the pricing logic can quickly become complex. Flexprice provides a structured way to manage that complexity by separating usage tracking from pricing logic.

At its core, Flexprice acts as a rating engine. Your application sends raw usage data—such as tokens processed or seconds of video generated—and Flexprice transforms that into billable amounts using configurable pricing rules.


Purpose

The primary purpose of Flexprice is to move billing concerns out of application code and into a dedicated system.

In a typical AI platform, pricing logic often starts simple but grows quickly:

  • different models have different pricing strategies
  • providers update their pricing over time
  • enterprise customers require custom pricing
  • prepaid credits and subscriptions need to be supported

Embedding all of this directly in your backend leads to rigid and hard-to-maintain code.

Flexprice solves this by becoming the single source of truth for pricing and billing. Your application focuses on generating value—running AI models—while Flexprice determines how that value is monetized.

This separation makes it easier to:

  • update pricing without redeploying services
  • support multiple billing models at once
  • introduce plans, credits, and discounts
  • maintain consistency across all usage types

How Flexprice Can Be Used

Flexprice is primarily delivered as a cloud-based SaaS platform.

In this model, your system communicates with Flexprice over HTTP APIs. It does not run inside your infrastructure, and you do not manage its deployment. Instead, it acts as an external service that your backend integrates with.

A typical interaction looks like this:

  • your application executes an AI request
  • it collects usage data (tokens, duration, images)
  • it sends a usage event to Flexprice
  • Flexprice applies pricing rules and records the billable amount

This approach has several practical implications.

First, it simplifies your architecture. You do not need to maintain your own billing engine, pricing tables, or invoice logic. Flexprice provides these capabilities out of the box.

Second, it introduces a clear boundary. Your system becomes responsible for what happened, while Flexprice becomes responsible for what it costs.

Finally, it enables flexibility at scale. Because pricing rules are configured in Flexprice, you can adjust them without touching your generation workflows. This is particularly useful in AI systems where pricing strategies evolve rapidly.

While Flexprice is primarily SaaS, larger organizations may explore dedicated or private deployments depending on their requirements. However, for most use cases, the cloud model is the standard approach.


Estimated Pricing for a Small Model

Flexprice pricing is typically based on usage volume, especially the number of events processed. Each time your system reports usage—such as a generated image or a video request—it counts as a billing event.

For a small AI application, the scale is usually modest.

Consider a simple scenario:

  • 100 active users
  • each user generates around 10 assets per day
  • this results in roughly 30,000 generation events per month (100 users × 10 assets × 30 days)

Since your current system emits approximately one billing event per request, this translates directly into about 30,000 events per month.
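The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function name and the 30-day month are assumptions made for illustration, and events_per_request is included only to show how the total changes if a request ever emits more than one billing event.

```python
def monthly_events(active_users: int, assets_per_user_per_day: int,
                   days_per_month: int = 30, events_per_request: int = 1) -> int:
    """Estimate billing events per month for a simple usage profile."""
    return active_users * assets_per_user_per_day * days_per_month * events_per_request


# The scenario from the text: 100 users, 10 assets per day, one event per request.
print(monthly_events(100, 10))
```

If a design emitted two events per request (for example, one at submission and one at completion), the same profile would double to 60,000 events, which is why one event per request is the recommended baseline.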

At this level, Flexprice usage is generally low. Many billing platforms, including Flexprice, offer a free tier or low-cost entry level that comfortably supports this scale. In practice, the monthly cost is often negligible or falls within a small operational budget.

Even when usage grows, the cost of Flexprice remains relatively small compared to the underlying AI model costs. For example, generating images or videos typically incurs significantly higher provider costs than the billing infrastructure itself.

The important takeaway is that Flexprice cost scales with event volume, not with the price of the AI models. As long as your system maintains a clean design—ideally one billing event per request—the billing overhead remains predictable and manageable.

Practical End-to-end Workflow

sequenceDiagram
    autonumber
    participant U as User
    participant A as App Backend
    participant B as Flexprice
    participant P as AI Provider
    U->>A: Submit generation request
    A->>B: Check subscription and balance
    B-->>A: Entitlements
    A->>P: Execute AI request
    P-->>A: Return result
    A->>B: Emit usage event
    B-->>A: Rated usage

1. Provision the customer in Flexprice when the account is created

When a user, workspace, or company becomes billable in your app, create a matching customer in Flexprice and store both IDs in your database. Flexprice supports customer creation and also lookup by external ID, so the cleanest pattern is to keep your own stable customer ID as external_id in Flexprice. In your app, store something like:

{
  "workspace_id": "ws_123",
  "flexprice_customer_id": "cus_fp_abc",
  "flexprice_external_id": "ws_123"
}

This is the anchor for everything else. Every usage event, wallet lookup, and subscription later should resolve through this mapping.
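One way to keep this mapping honest is to model it as a small typed record and enforce the "external ID equals your own ID" convention when saving. The sketch below uses an in-memory dictionary as a stand-in for your database; BillingMapping and MappingStore are hypothetical names, not part of any Flexprice SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillingMapping:
    workspace_id: str
    flexprice_customer_id: str
    flexprice_external_id: str


class MappingStore:
    """In-memory stand-in for the table that links workspaces to Flexprice."""

    def __init__(self) -> None:
        self._by_workspace: dict[str, BillingMapping] = {}

    def save(self, mapping: BillingMapping) -> None:
        # Convention from the text: your own stable ID is reused as external_id.
        if mapping.flexprice_external_id != mapping.workspace_id:
            raise ValueError("external_id must equal the workspace ID")
        self._by_workspace[mapping.workspace_id] = mapping

    def resolve(self, workspace_id: str) -> BillingMapping:
        # Raises KeyError if the workspace was never provisioned for billing.
        return self._by_workspace[workspace_id]


store = MappingStore()
store.save(BillingMapping("ws_123", "cus_fp_abc", "ws_123"))
print(store.resolve("ws_123").flexprice_customer_id)
```

Failing loudly on an unknown workspace is deliberate here: a generation request from a customer with no billing mapping should be caught before any usage is produced.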

2. Create or attach a subscription when the customer chooses a plan

When the customer starts a free, pro, or enterprise tier, create a Flexprice subscription for that customer. Flexprice has dedicated APIs for plans, prices, and subscriptions, so your app should not hardcode plan economics once Flexprice is live. Your app should only decide which commercial offering the customer is on, then let Flexprice own the billing structure.

At this point your app should save:

  • flexprice_plan_id
  • flexprice_subscription_id
  • current commercial state such as trialing, active, or cancel_at_period_end

Operationally, this means your AI gateway can ask, “is this workspace on a valid plan?” without having to recalculate plan rules itself.

3. Before generation, check entitlements and credit state

Before sending a request to OpenAI, Google, Kling, BytePlus, or another provider, your app should check whether the user is allowed to consume the requested resource. Flexprice exposes subscription entitlements and wallet-related APIs, including customer entitlements, subscription entitlements, wallet balances, and upcoming credit grant applications.

This is where your app decides things like:

  • Can this customer use Veo at all?
  • Can they use only standard quality, not pro?
  • Do they still have prepaid credits?
  • Are they allowed to generate 4K or long-duration video?

The clean pattern is:

  1. app receives generation request
  2. app reads its own auth and product context
  3. app queries cached Flexprice subscription or wallet state
  4. app allows, rejects, or downgrades the request

Do not block the live user path on repeated Flexprice round-trips if you can avoid it. Cache entitlements and wallet balance briefly in your app, then refresh asynchronously or on important state changes.
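The caching advice can be sketched as a small time-to-live cache placed in front of whatever function fetches entitlement or wallet state. The fetch callable is injected, so no particular Flexprice endpoint or client is assumed; EntitlementCache, its 60-second default, and the invalidate hook are illustrative choices.

```python
import time
from typing import Callable


class EntitlementCache:
    """Cache per-customer billing state for a short time-to-live."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical: wraps your Flexprice lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, dict]] = {}

    def get(self, customer_id: str) -> dict:
        now = self._clock()
        entry = self._entries.get(customer_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: no round-trip
        state = self._fetch(customer_id)
        self._entries[customer_id] = (now, state)
        return state

    def invalidate(self, customer_id: str) -> None:
        """Drop cached state, e.g. after a subscription or wallet change."""
        self._entries.pop(customer_id, None)
```

Calling invalidate when a relevant state change arrives keeps the cache from serving stale entitlements for the full time-to-live.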

4. Execute the AI request in your app and collect raw usage

Your app should continue to call the model provider directly and capture the raw usage facts from the provider response or your own workflow layer. This is the most important design principle:

Send raw usage to Flexprice, not only a precomputed final cost.

For example:

  • DALL·E: image count, size, quality
  • GPT Image: input tokens, output tokens
  • Gemini image: input tokens, output tokens, fallback flag
  • Veo/Kling video: duration, resolution, mode, audio flag
  • Seedance: completion tokens if available, else estimated token quantity
  • Marble/world generation: request type, quality, count

Flexprice’s event ingestion model is designed exactly for this pattern: your app emits events with event_name, customer identity, timestamps, and arbitrary usage properties.
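As a rough illustration, raw usage capture can be as simple as a pair of helpers that turn request and response facts into event properties. The function names and property keys below are assumptions; the second helper shows the "reported tokens if available, else estimated" fallback and flags estimates so they can be distinguished later.

```python
from typing import Optional


def video_usage(duration_seconds: float, resolution: str, mode: str,
                generate_audio: bool) -> dict:
    """Raw usage facts for a duration-priced video model."""
    return {
        "duration_seconds": duration_seconds,
        "resolution": resolution,
        "mode": mode,
        "generate_audio": generate_audio,
    }


def token_usage(reported_tokens: Optional[int], estimated_tokens: int) -> dict:
    """Prefer provider-reported tokens; fall back to an estimate and flag it."""
    if reported_tokens is not None:
        return {"completion_tokens": reported_tokens, "estimated": False}
    return {"completion_tokens": estimated_tokens, "estimated": True}


print(video_usage(8, "1080p", "standard", True))
print(token_usage(None, 1200))
```

Note that neither helper computes a price. They only record what happened, which is the part your app is responsible for.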

5. After the job completes, ingest a usage event into Flexprice

Once the AI generation finishes successfully, emit a billing event to Flexprice. If you may produce duplicate callbacks or retries, always generate a stable event ID on your side and send that along, so the event pipeline is idempotent from your app’s perspective. Flexprice has both single-event ingestion and bulk ingestion APIs, which gives you flexibility depending on latency needs.

A good event shape from your app looks like this:

{
  "event_name": "video_generated",
  "external_customer_id": "ws_123",
  "event_id": "gen_789_final",
  "source": "ai-gateway",
  "timestamp": "2026-04-15T10:15:00Z",
  "properties": {
    "provider": "google-vertex",
    "model": "veo-3.1-lite-generate-001",
    "duration_seconds": 8,
    "resolution": "1080p",
    "quality": "standard",
    "generate_audio": true,
    "job_id": "job_789"
  }
}

For token-based image generation, the event should carry token counts rather than only dollar cost.
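A minimal sketch of building such an event with a stable ID follows. The field names mirror the example payload above; deriving event_id from the job ID and event name is one possible scheme (an assumption, not a Flexprice requirement), chosen so that a retried callback for the same job produces an identical event.

```python
from datetime import datetime, timezone


def build_usage_event(event_name: str, external_customer_id: str, job_id: str,
                      properties: dict, finished_at: datetime) -> dict:
    """Build a usage event whose ID is a pure function of the job."""
    return {
        "event_name": event_name,
        "external_customer_id": external_customer_id,
        # Same job and event name always yield the same ID, so retries are safe.
        "event_id": f"{job_id}:{event_name}",
        "source": "ai-gateway",
        # finished_at is expected to be timezone-aware.
        "timestamp": finished_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "properties": {**properties, "job_id": job_id},
    }


finished = datetime(2026, 4, 15, 10, 15, tzinfo=timezone.utc)
event = build_usage_event("video_generated", "ws_123", "job_789",
                          {"duration_seconds": 8, "resolution": "1080p"}, finished)
print(event["event_id"], event["timestamp"])
```

Using the job completion time rather than "now" for the timestamp is what keeps a retried event byte-for-byte identical to the first attempt.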

6. Let Flexprice calculate billable usage from prices and filters

Your pricing logic should live in Flexprice prices, meters, filters, and plan-level configuration, not inside your generation workflow. Flexprice’s pricing API supports usage pricing, meter-linked pricing, price units, tiers, transform rules, and filter values, which is exactly what you need for model families where price depends on model, quality, resolution, duration, or mode.

So instead of treating a hardcoded formula like this as your source of truth:

cost = ceil(duration) * base * resolution_multiplier * model_multiplier

your app should do this:

collect usage -> emit usage event -> Flexprice applies pricing

This is the biggest architectural improvement. It lets product or finance teams change pricing without changing generation code.

7. Keep a local billing ledger for observability, but not as the source of truth

Even after adopting Flexprice, your app should still write a local internal billing record for each generation request. This should include:

  • app request ID
  • provider request ID
  • flexprice event ID
  • workspace/customer ID
  • model
  • raw usage facts
  • estimated local cost, if you still compute one
  • Flexprice-rated amount once available

This gives you reconciliation, support tooling, and auditability. Flexprice is your billing authority; your local ledger is your operational trace.
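The ledger record can be sketched as a plain data class carrying the fields listed above. The class and field names are illustrative; the drift helper is an added convenience that compares your optional local estimate with the amount Flexprice rated, which is often the first thing a support or reconciliation tool wants to see.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class BillingLedgerEntry:
    app_request_id: str
    provider_request_id: str
    flexprice_event_id: str
    customer_id: str
    model: str
    raw_usage: dict
    estimated_cost: Optional[Decimal] = None   # local estimate, if computed
    rated_amount: Optional[Decimal] = None     # filled in once Flexprice has rated

    def drift(self) -> Optional[Decimal]:
        """Rated amount minus local estimate, or None if either is missing."""
        if self.estimated_cost is None or self.rated_amount is None:
            return None
        return self.rated_amount - self.estimated_cost


entry = BillingLedgerEntry("req_1", "prov_1", "job_789:video_generated", "ws_123",
                           "veo-3.1-lite-generate-001", {"duration_seconds": 8},
                           estimated_cost=Decimal("3.80"))
entry.rated_amount = Decimal("3.84")
print(entry.drift())
```

Decimal is used instead of float so that small differences between estimate and rated amount are not artifacts of binary rounding.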

8. Preview the charge before committing high-cost actions

For expensive models such as long-form video or world generation, your app should preview cost before submitting the provider job. Flexprice exposes invoice preview endpoints, including preview based on meter usage, which makes it possible to estimate what a request would cost before you actually ingest the final event.

This is useful for flows like:

  • “This Veo request is estimated to cost €3.84. Continue?”
  • “You do not have enough credits for this request.”
  • “This request exceeds your plan limit.”

For this pattern, your app builds a hypothetical usage payload from request parameters, calls preview, then decides whether to proceed.
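The decision step might look like the sketch below. The preview callable is injected and stands in for whatever wraps the Flexprice preview call in your app; its signature, the confirm_threshold parameter, and the returned action names are all assumptions for illustration.

```python
from decimal import Decimal
from typing import Callable


def decide_before_submit(request_params: dict, wallet_balance: Decimal,
                         preview: Callable[[dict], Decimal],
                         confirm_threshold: Decimal) -> dict:
    """Preview a hypothetical usage payload and choose how to proceed."""
    # Build the hypothetical usage from request parameters only.
    usage = {
        "duration_seconds": request_params["duration_seconds"],
        "resolution": request_params["resolution"],
    }
    estimated = preview(usage)
    if estimated > wallet_balance:
        return {"action": "reject", "reason": "insufficient_credits",
                "estimated": estimated}
    if estimated >= confirm_threshold:
        return {"action": "confirm", "estimated": estimated}
    return {"action": "proceed", "estimated": estimated}


# Stub preview for demonstration only: a flat per-second rate.
def stub_preview(usage: dict) -> Decimal:
    return Decimal("0.48") * usage["duration_seconds"]


print(decide_before_submit({"duration_seconds": 8, "resolution": "1080p"},
                           Decimal("10.00"), stub_preview, Decimal("1.00")))
```

The stub rate exists only so the example runs. In a real integration the estimate should come from the preview response, so that the number shown to the user matches what will later be rated.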

9. Use wallets for prepaid credits, but keep them separate from usage events

If your commercial model includes prepaid credits, monthly grants, or top-ups, manage that through Flexprice wallets rather than embedding credit arithmetic in the AI workflow. Flexprice has wallet creation, balance, transaction, top-up, termination, and alert-related APIs and events.

A strong pattern is:

  • usage events always describe consumption
  • wallet logic handles how that consumption is paid for
  • invoices handle any overage beyond wallet or commitment balance

That separation keeps your system understandable.

10. Listen to Flexprice webhook events and reflect them back into your app

Your app should subscribe to Flexprice webhook events for subscription and wallet lifecycle changes. The API reference exposes webhook event schemas such as subscription.created, subscription.activated, subscription.updated, subscription.cancelled, wallet.created, wallet.transaction.created, and credit-balance alert events.

Those webhooks should update your app state, for example:

  • subscription activated → unlock paid models
  • subscription paused → block new generation
  • wallet credit dropped → warn user in UI
  • invoice finalized → store invoice reference in your billing portal
  • payment failed → downgrade after grace rules

This prevents the billing state from drifting between systems.
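A webhook receiver can be kept small by routing on the event type. In the sketch below, the event type strings come from the list above, but the payload shape (an external_customer_id field) and the AppState class are assumptions; check the actual webhook schemas before relying on any field name.

```python
from typing import Callable


class AppState:
    """Minimal stand-in for the access-control state your app keeps."""

    def __init__(self) -> None:
        self.paid_access: set[str] = set()


def on_subscription_activated(state: AppState, payload: dict) -> None:
    state.paid_access.add(payload["external_customer_id"])


def on_subscription_cancelled(state: AppState, payload: dict) -> None:
    state.paid_access.discard(payload["external_customer_id"])


HANDLERS: dict[str, Callable[[AppState, dict], None]] = {
    "subscription.activated": on_subscription_activated,
    "subscription.cancelled": on_subscription_cancelled,
}


def handle_webhook(state: AppState, event_type: str, payload: dict) -> bool:
    """Dispatch one webhook; return False for event types we do not handle."""
    handler = HANDLERS.get(event_type)
    if handler is None:
        return False
    handler(state, payload)
    return True
```

Returning False for unknown types, rather than raising, lets the receiver acknowledge events it does not care about without failing the delivery.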

11. Sync invoices and payments through your payment provider

If you collect card payments through Stripe or another PSP, Flexprice can sit in the middle as the billing brain while the payment provider handles payment execution. Flexprice’s docs describe invoice sync and payment workflows, including integration-driven invoice finalization and customer sync behavior.

A clean commercial flow is:

  1. app emits usage to Flexprice
  2. Flexprice aggregates usage into invoiceable amounts
  3. Flexprice finalizes or syncs invoice
  4. payment provider charges the customer
  5. payment or invoice status flows back into your app through webhooks

12. Reconcile regularly

Even with a good design, AI billing is noisy. Providers may retry callbacks, return delayed usage, or expose incomplete metrics for some models. So your app should run a reconciliation job that compares:

  • provider job records
  • app generation records
  • Flexprice raw events
  • Flexprice usage by subscription or meter
  • invoice totals

Flexprice provides usage analytics, raw event listing, usage by meter, usage by subscription, and invoice retrieval APIs that support this reconciliation loop.
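At its simplest, the reconciliation job is a set comparison over job identifiers collected from each system. The sketch below assumes you have already loaded three sets of job IDs (from the provider, from your own records, and from the job_id property on Flexprice raw events); the function and key names are illustrative.

```python
def reconcile(provider_job_ids: set[str], app_job_ids: set[str],
              flexprice_job_ids: set[str]) -> dict[str, list[str]]:
    """Report jobs that are missing from one system relative to another."""
    return {
        # The provider ran it, but the app has no record of it.
        "missing_in_app": sorted(provider_job_ids - app_job_ids),
        # The app recorded it, but no usage event reached Flexprice.
        "unbilled": sorted(app_job_ids - flexprice_job_ids),
        # Flexprice has an event for a job the app does not know.
        "unexpected_events": sorted(flexprice_job_ids - app_job_ids),
    }


report = reconcile({"job_1", "job_2", "job_3"},
                   {"job_1", "job_2"},
                   {"job_1", "job_9"})
print(report)
```

Comparing amounts (usage by meter, invoice totals) is a natural second pass, but ID-level gaps are usually the cheapest discrepancies to detect and the most important to fix first.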

The workflow in one simple sequence

A good production sequence looks like this:

User submits AI request
-> App authenticates and identifies workspace/customer
-> App checks cached entitlements and wallet state from Flexprice
-> App optionally previews price for expensive requests
-> App sends request to AI provider
-> App stores generation job and raw provider metadata
-> App emits final usage event to Flexprice
-> Flexprice applies prices and updates wallet/invoice state
-> Flexprice webhooks notify your app about subscription, wallet, and invoice changes
-> App updates UI, billing history, and access controls

What should stay in your app

Your app should own:

  • identity and auth
  • request validation
  • model routing
  • provider execution
  • raw usage capture
  • generation/job history
  • UX decisions such as warnings, gating, and retries

What should move to Flexprice

Flexprice should own:

  • pricing tables
  • usage rating
  • credits or wallet balances
  • subscriptions and entitlements
  • invoice generation
  • billing analytics and payment-state-driven lifecycle handling