Billing Reconciliation¶

1) How to access detailed vendor billing for the models in this project¶

OpenAI¶

OpenAI’s supported source of detailed spend and usage is the API Usage Dashboard, where you can export both Activity data and Cost data as CSV. The dashboard supports project-level breakdowns, and OpenAI notes that projects can be used to track usage and budgets per project. OpenAI’s API responses also include a usage object for request-level token accounting, which is essential for your internal meter even if vendor billing is reconciled from exported cost files. Starting with invoices issued on April 1, 2026, detailed API costs are no longer provided on Enterprise invoices and OpenAI points customers to the export flow instead. (OpenAI Help Center)

What this means for you: Use OpenAI request responses for internal per-request metering, and use daily or monthly Usage Dashboard CSV exports as vendor truth for cost reconciliation. OpenAI is good for cost/activity exports, but your implementation should not depend on invoice PDFs for detail anymore. (OpenAI Help Center)

Gemini on Vertex AI¶

If Gemini runs through Vertex AI, this is the cleanest reconciliation path. Google supports Cloud Billing export to BigQuery, including a detailed usage cost export. Vertex AI also supports custom metadata labels on generateContent and streamGenerateContent calls, and those labels are forwarded into billing so costs can be filtered and grouped by label. In addition, Vertex AI supports request-response logging to BigQuery for Gemini and supported partner models. (Google Cloud Documentation)

What this means for you: If you want robust reconciliation, route paid Gemini production traffic through Vertex AI, attach your own request labels, and treat BigQuery billing export as the vendor ledger. This is by far the strongest option. (Google Cloud Documentation)

Note

However, labels are not request-level and you do not expect them to show up in billing attached to a certain request.

Gemini via Google AI Studio / Gemini API¶

If you use Gemini directly via the Gemini API, Google documents usage_metadata on responses, including prompt, output, cached, thought, and total token counts, and AI Studio provides a logs page for supported API calls on billing-enabled projects. Google also states that developer-owned logs are available for supported Gemini API calls when billing is enabled. (Google AI for Developers)

What this means for you: This is usable for internal usage metering, but not as strong as Vertex AI for accounting-grade reconciliation. For Gemini API direct, your vendor reconciliation should rely mostly on captured usage_metadata + AI Studio logs + monthly billing totals, not on a rich line-item billing export like Cloud Billing BigQuery. (Google AI for Developers)

Kling¶

Kling’s public docs clearly expose credit-based pricing schedules for major video models, such as Kling VIDEO 3.0 and O1, and the API docs expose task query endpoints. From the official material I found, Kling documents pricing rules and task APIs, but there is no official detailed billing export equivalent to Cloud Billing BigQuery export. So for Kling, the safe assumption is that reconciliation must be driven by your own task ledger plus model credit rules, with vendor portal totals used as a periodic check. (Kling AI)

What this means for you: Treat Kling as a credit-rule reconciliation vendor. Your system should persist every request’s pricing inputs, compute expected credits internally, then compare them with returned task metadata and monthly account totals from the vendor console. (Kling AI)

2) The recommended implementation¶

Use a three-ledger model:

Internal usage ledger — what your app believes happened per request
Vendor usage ledger — what the vendor reports per exported row, task, or billing record
Invoice ledger — monthly invoice or statement totals

Do not overwrite one with another. Reconciliation is the act of comparing them, not merging them.

Core rule¶

Every request gets a stable recon_key that survives retries and vendor-specific IDs:

recon_key = sha256(environment + tenant_id + request_id + model + started_at_utc)

For Vertex AI, also attach this as a billing label where possible. For OpenAI and Gemini API direct, keep it in your app logs. For Kling, keep it alongside the task id.

3) Concrete tables¶

Below is a practical PostgreSQL schema.

create table vendor_account (
  id bigserial primary key,
  vendor text not null,                       -- openai, google_vertex, google_ai_studio, kling
  account_name text not null,
  environment text not null,                  -- prod, stage
  billing_scope text,                         -- org/project/billing_account/workspace
  currency_code text not null default 'USD',
  is_active boolean not null default true,
  created_at timestamptz not null default now()
);

create table ai_request_fact (
  id bigserial primary key,
  recon_key text not null unique,
  request_id text not null,
  vendor_request_id text,                     -- completion id / task id / provider request id
  vendor_account_id bigint not null references vendor_account(id),

  environment text not null,
  tenant_id text,
  team_id text,
  user_id text,

  vendor text not null,
  provider text not null,                     -- openai/google/kling
  model text not null,
  operation text not null,                    -- text_generation/image_generation/video_generation/etc
  region text,
  service_tier text,
  status text not null,                       -- succeeded/failed/cancelled

  started_at timestamptz not null,
  finished_at timestamptz,
  request_date date generated always as ((started_at at time zone 'UTC')::date) stored,

  internal_input_tokens bigint,
  internal_output_tokens bigint,
  internal_thinking_tokens bigint,
  internal_cached_tokens bigint,

  internal_images integer,
  internal_video_seconds numeric(10,2),
  internal_audio_seconds numeric(10,2),
  internal_credits numeric(18,6),

  internal_usage jsonb,                       -- raw adapter usage details
  internal_estimated_cost numeric(18,6),
  currency_code text not null default 'USD',

  created_at timestamptz not null default now()
);

create index idx_ai_request_fact_day_vendor_model
  on ai_request_fact(request_date, vendor, model);

create table vendor_usage_raw (
  id bigserial primary key,
  vendor_account_id bigint not null references vendor_account(id),
  vendor text not null,
  source_type text not null,                  -- openai_cost_csv, gcp_bq_detailed_export, kling_task_api, invoice_csv
  source_ref text,                            -- filename, gs://..., bq table, api batch id
  imported_at timestamptz not null default now(),
  payload jsonb not null
);

create table vendor_usage_line (
  id bigserial primary key,
  vendor_account_id bigint not null references vendor_account(id),
  vendor text not null,
  source_type text not null,
  source_ref text,

  usage_start timestamptz not null,
  usage_end timestamptz,
  usage_date date generated always as ((usage_start at time zone 'UTC')::date) stored,

  vendor_request_id text,
  recon_key text,                             -- populated when available
  billing_project text,
  billing_label_recon_key text,
  billing_label_tenant_id text,
  billing_label_team_id text,

  model text,
  operation text,
  sku text,
  metric_name text,                           -- input_tokens, output_tokens, credits, seconds
  metric_value numeric(18,6),
  cost_amount numeric(18,6),
  currency_code text,

  raw_usage jsonb,
  created_at timestamptz not null default now()
);

create index idx_vendor_usage_line_day_vendor_model
  on vendor_usage_line(usage_date, vendor, model);

create table vendor_invoice (
  id bigserial primary key,
  vendor_account_id bigint not null references vendor_account(id),
  vendor text not null,
  invoice_number text,
  invoice_month date not null,                -- store first day of month
  invoice_date date,
  currency_code text not null,
  invoice_total numeric(18,6) not null,
  tax_amount numeric(18,6),
  credits_amount numeric(18,6),
  raw_invoice jsonb,
  unique (vendor_account_id, vendor, invoice_month, invoice_number)
);

create table vendor_invoice_line (
  id bigserial primary key,
  vendor_invoice_id bigint not null references vendor_invoice(id),
  model text,
  sku text,
  description text,
  metric_name text,
  metric_value numeric(18,6),
  line_amount numeric(18,6),
  currency_code text,
  raw_line jsonb
);

create table reconciliation_run (
  id bigserial primary key,
  run_type text not null,                     -- daily_usage, monthly_invoice
  vendor text,
  period_start timestamptz not null,
  period_end timestamptz not null,
  status text not null default 'running',
  created_at timestamptz not null default now(),
  finished_at timestamptz
);

create table reconciliation_result (
  id bigserial primary key,
  reconciliation_run_id bigint not null references reconciliation_run(id),

  grain text not null,                        -- vendor/day/model/tenant or request
  vendor text not null,
  model text,
  tenant_id text,
  usage_date date,

  internal_metric_name text,
  internal_metric_value numeric(18,6),
  vendor_metric_name text,
  vendor_metric_value numeric(18,6),

  internal_cost numeric(18,6),
  vendor_cost numeric(18,6),
  delta_cost numeric(18,6),
  delta_pct numeric(18,6),

  match_status text not null,                 -- matched/warn/fail/unmatched_internal/unmatched_vendor
  notes text,
  created_at timestamptz not null default now()
);

4) Vendor-specific ingestion strategy¶

OpenAI ingestion¶

Persist request-level usage from API responses into ai_request_fact. Then import Activity CSV and Cost CSV from the Usage Dashboard into vendor_usage_raw, normalize them into vendor_usage_line, and reconcile by date, project, model, and if possible API key / project scope. Since OpenAI’s official export flow is now the supported path for detailed monthly cost breakdowns, I would make that export the vendor-side source of truth. (OpenAI Help Center)

Vertex AI ingestion¶

Enable Cloud Billing detailed export to BigQuery, enable custom metadata labels on every inference call, and enable request-response logging to BigQuery. Then build one ETL that pulls detailed billing rows into vendor_usage_line and one ETL that pulls request/response logs into a staging table for validation and enrichment. (Google Cloud Documentation)

Gemini API direct ingestion¶

Persist usage_metadata from every response into ai_request_fact and enable AI Studio logs for supported calls. Use those logs as a support/audit aid, but do not expect the same accounting-grade line-item billing join that Vertex AI gives you through Cloud Billing export. (Google AI for Developers)

Kling ingestion¶

Store every task submission and every task query response. At request time, compute expected credits from model/version/duration/resolution/audio flags according to the active pricing rules you configure in your adapter. Then refresh task status until terminal and import the final task payload into vendor_usage_line. Reconcile monthly against vendor portal totals or statements. Kling’s own docs show model-specific credit schedules, for example VIDEO 3.0 pricing by mode and resolution. (Kling AI)

5) The cron jobs¶

This is the recommended cadence.

Every 5 minutes¶

Finalize recent internal request facts.

*/5 * * * *  finalize_ai_request_facts

Purpose: make sure every completed request has stable usage fields and cost estimates.

Hourly¶

Pull near-real-time vendor telemetry where available.

15 * * * *   import_openai_usage_activity
20 * * * *   import_kling_task_updates
25 * * * *   import_ai_studio_logs

For OpenAI, this is only worthwhile if you can automate dashboard export in your environment; otherwise run it daily. For Kling, hourly task sync matters because video jobs can complete later than request submission.

Daily, early morning UTC¶

Import accounting-grade vendor data for the previous UTC day.

10 03 * * *  import_openai_cost_exports_prev_day
20 03 * * *  import_vertex_billing_bigquery_prev_day
35 03 * * *  import_vertex_request_response_logs_prev_day
50 03 * * *  import_kling_daily_account_summary
15 04 * * *  run_daily_reconciliation_prev_day
30 04 * * *  send_reconciliation_report

Monthly¶

Lock the period only after vendor data has settled.

30 06 2 * *  import_monthly_vendor_invoices_prev_month
45 06 2 * *  run_monthly_invoice_reconciliation_prev_month
15 07 2 * *  send_monthly_reconciliation_report

I would reconcile the prior month on the 2^nd day of the month, not the 1^st, because vendor exports and credits often settle with a lag.

6) Sample SQL: daily reconciliation¶

This example compares daily internal and vendor cost by vendor/model/tenant.

with internal_daily as (
  select
    request_date as usage_date,
    vendor,
    model,
    coalesce(tenant_id, '_unknown') as tenant_id,
    sum(coalesce(internal_estimated_cost, 0)) as internal_cost,
    sum(coalesce(internal_input_tokens, 0)) as input_tokens,
    sum(coalesce(internal_output_tokens, 0)) as output_tokens,
    sum(coalesce(internal_credits, 0)) as credits
  from ai_request_fact
  where request_date = current_date - interval '1 day'
    and status = 'succeeded'
  group by 1,2,3,4
),
vendor_daily as (
  select
    usage_date,
    vendor,
    model,
    coalesce(billing_label_tenant_id, '_unknown') as tenant_id,
    sum(coalesce(cost_amount, 0)) as vendor_cost
  from vendor_usage_line
  where usage_date = current_date - interval '1 day'
  group by 1,2,3,4
)
insert into reconciliation_result (
  reconciliation_run_id,
  grain,
  vendor,
  model,
  tenant_id,
  usage_date,
  internal_metric_name,
  internal_metric_value,
  vendor_metric_name,
  vendor_metric_value,
  internal_cost,
  vendor_cost,
  delta_cost,
  delta_pct,
  match_status,
  notes
)
select
  :run_id,
  'vendor/day/model/tenant',
  coalesce(i.vendor, v.vendor),
  coalesce(i.model, v.model),
  coalesce(i.tenant_id, v.tenant_id),
  coalesce(i.usage_date, v.usage_date),
  'estimated_cost',
  coalesce(i.internal_cost, 0),
  'vendor_cost',
  coalesce(v.vendor_cost, 0),
  coalesce(i.internal_cost, 0),
  coalesce(v.vendor_cost, 0),
  coalesce(i.internal_cost, 0) - coalesce(v.vendor_cost, 0),
  case
    when coalesce(v.vendor_cost, 0) = 0 and coalesce(i.internal_cost, 0) = 0 then 0
    when coalesce(v.vendor_cost, 0) = 0 then 1
    else (coalesce(i.internal_cost, 0) - v.vendor_cost) / v.vendor_cost
  end,
  case
    when i.vendor is null then 'unmatched_vendor'
    when v.vendor is null then 'unmatched_internal'
    when abs(
      case
        when coalesce(v.vendor_cost, 0) = 0 then 1
        else (coalesce(i.internal_cost, 0) - v.vendor_cost) / v.vendor_cost
      end
    ) <= 0.02 then 'matched'
    when abs(
      case
        when coalesce(v.vendor_cost, 0) = 0 then 1
        else (coalesce(i.internal_cost, 0) - v.vendor_cost) / v.vendor_cost
      end
    ) <= 0.05 then 'warn'
    else 'fail'
  end,
  null
from internal_daily i
full outer join vendor_daily v
  on i.usage_date = v.usage_date
 and i.vendor = v.vendor
 and i.model = v.model
 and i.tenant_id = v.tenant_id;

Thresholds¶

Use:

≤ 2% = matched
2–5% = warn
> 5% = fail
> 10% on a billable tenant = block billing export

That is strict enough for margin protection without turning harmless noise into constant incidents.

7) Sample SQL: request-level reconciliation for Kling¶

Kling is where request-level reconciliation matters more than daily cost totals.

insert into reconciliation_result (
  reconciliation_run_id,
  grain,
  vendor,
  model,
  tenant_id,
  usage_date,
  internal_metric_name,
  internal_metric_value,
  vendor_metric_name,
  vendor_metric_value,
  internal_cost,
  vendor_cost,
  delta_cost,
  delta_pct,
  match_status,
  notes
)
select
  :run_id,
  'request',
  f.vendor,
  f.model,
  f.tenant_id,
  f.request_date,
  'credits',
  coalesce(f.internal_credits, 0),
  'credits',
  coalesce(v.metric_value, 0),
  coalesce(f.internal_estimated_cost, 0),
  coalesce(v.cost_amount, 0),
  coalesce(f.internal_estimated_cost, 0) - coalesce(v.cost_amount, 0),
  case
    when coalesce(v.metric_value, 0) = 0 and coalesce(f.internal_credits, 0) = 0 then 0
    when coalesce(v.metric_value, 0) = 0 then 1
    else (coalesce(f.internal_credits, 0) - v.metric_value) / v.metric_value
  end,
  case
    when v.id is null then 'unmatched_internal'
    when abs(coalesce(f.internal_credits, 0) - coalesce(v.metric_value, 0)) < 0.0001 then 'matched'
    else 'fail'
  end,
  concat('vendor_request_id=', coalesce(f.vendor_request_id, 'n/a'))
from ai_request_fact f
left join vendor_usage_line v
  on v.vendor = 'kling'
 and v.vendor_request_id = f.vendor_request_id
 and v.metric_name = 'credits'
where f.vendor = 'kling'
  and f.request_date = current_date - interval '1 day';

8) Sample BigQuery query for Vertex AI billing¶

This is the most valuable query in the whole design. Because Vertex AI billing export contains labels, you can reconcile directly against your own request metadata. Google documents that labels are forwarded to billing and detailed usage is available in BigQuery. (Google Cloud Documentation)

select
  usage_start_time,
  usage_end_time,
  service.description as service_name,
  sku.description as sku_name,
  project.id as project_id,
  location.location as region,
  labels.value as label_value,
  labels.key as label_key,
  cost,
  currency,
  usage.amount as usage_amount,
  usage.unit as usage_unit
from `YOUR_BILLING_DATASET.gcp_billing_export_resource_v1_*`,
unnest(labels) as labels
where service.description = 'Vertex AI'
  and usage_start_time >= timestamp_sub(current_timestamp(), interval 2 day)
  and labels.key in ('recon_key', 'tenant_id', 'team_id');

Then normalize the result into vendor_usage_line.

9) The reports you should generate¶

Daily reconciliation report¶

Send this to engineering + finance ops every morning.

Include:

total internal estimated cost
total vendor cost ingested
delta $
delta %
top 10 failed model/tenant combinations
unmatched internal requests count
unmatched vendor rows count
vendor ingestion freshness

Example layout:

Date: 2026-04-15 UTC

Vendor summary
- OpenAI: internal $812.14 / vendor $807.90 / delta +$4.24 (+0.52%) => matched
- Google Vertex: internal $1,104.21 / vendor $1,108.87 / delta -$4.66 (-0.42%) => matched
- Kling: internal $392.00 / vendor $428.00 / delta -$36.00 (-8.41%) => fail

Top failures
1. Kling / video_generation / tenant=acme-media / delta -$28.00
2. Gemini direct / gemini-2.5-flash / tenant=demo / unmatched vendor usage
3. OpenAI / gpt-4.1-mini / tenant=internal / delta +$6.10

Monthly invoice reconciliation report¶

This is the accounting one.

Include:

invoice total
summed vendor imported lines
summed internal estimated cost
credits/adjustments/taxes separated
unresolved variance
whether customer billing export may proceed

Example decision rule:

unresolved variance under 1% of invoice: book adjustment
1–3%: finance review
over 3%: hold invoice finalization for affected customer groups

Model drift report¶

This one is operational and very useful.

Include:

models with persistent positive or negative bias over the last 14 days
whether drift maps to a model version change, pricing change, or adapter bug
especially important for Gemini thinking/cached tokens and Kling audio/video mode toggles

10) The most important implementation details¶

For Vertex AI, label every request¶

This is the highest-ROI change you can make. Use labels such as:

{
  "recon_key": "abc123",
  "tenant_id": "t_42",
  "team_id": "marketing",
  "model_key": "gemini-2.5-flash",
  "operation": "text_generation"
}

Google documents that labels are forwarded to billing and can be queried in reports and exports. (Google Cloud Documentation)

For OpenAI, separate metering from reconciliation¶

Use the API response usage for request-level meter accuracy, but use dashboard export CSVs as your vendor accounting source. That avoids tying your finance process to ephemeral per-request data alone. (OpenAI Platform)

For Gemini API direct, do not pretend it is Vertex AI¶

Persist usage_metadata, enable logs, and accept that reconciliation confidence is lower than with Cloud Billing detailed export. If this traffic is financially significant, move it to Vertex AI. (Google AI for Developers)

For Kling, version your pricing rules¶

Kling’s credit schedules are model- and mode-specific, so your adapter must price against a versioned rule table, not hardcoded logic. Their own published pricing shows why. (Kling AI)

A good table for that is:

create table vendor_pricing_rule (
  id bigserial primary key,
  vendor text not null,
  model text not null,
  operation text not null,
  effective_from timestamptz not null,
  effective_to timestamptz,
  rule_version text not null,
  pricing_formula text not null,      -- descriptive
  pricing_params jsonb not null       -- e.g. {"720p_no_audio": 6, "1080p_audio": 12}
);

11) What to do first in your project¶

Move production Gemini traffic to Vertex AI wherever possible. That gives you Cloud Billing detailed export, labels, and request-response logging. (Google Cloud Documentation)
Capture request-level usage for every provider response into ai_request_fact. OpenAI and Gemini both expose request usage in responses. (OpenAI Platform)
Add a versioned pricing-rule table for Kling and any other opaque/credit-based vendor. (Kling AI)
Run daily reconciliation at vendor/model/tenant grain and request-level reconciliation for Kling.
Run monthly invoice reconciliation separately and never rely on the daily job alone for accounting close.

The one-line summary is this:

Use request responses for internal metering, vendor exports for financial truth, and keep a separate reconciliation ledger that compares the two without mutating either one.