Google Veo 3
Overview
Veo 3 is Google’s cinematic video generation line on Vertex AI. Google documents it as a video model for text-to-video and image-to-video generation, with creative controls, prompt rewriting, and support for richer production workflows than early video APIs. For engineers and architects, the practical takeaway is that Veo is designed around an asynchronous submit-and-poll lifecycle: a request starts a long-running job whose result is fetched later, rather than returning a video in a single fast response.
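To make the submit-and-poll lifecycle concrete, here is a minimal Python sketch. It is not tied to any real client library: `submit` and `fetch_status` are hypothetical callables that the caller supplies, and the `done` and `error` keys are assumptions modeled on common long-running-operation conventions, not confirmed field names for this integration.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    submit: Callable[[], str],
    fetch_status: Callable[[str], Dict[str, Any]],
    poll_interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a generation job, then poll until it is done or the timeout passes."""
    # Hypothetical: submit() returns a job handle immediately, not a video.
    operation_id = submit()
    waited = 0.0
    while True:
        status = fetch_status(operation_id)
        if status.get("done"):  # assumed completion flag
            if "error" in status:  # assumed error key
                raise RuntimeError(f"generation failed: {status['error']}")
            return status
        if waited >= timeout_s:
            raise TimeoutError(
                f"operation {operation_id} not done after {timeout_s}s"
            )
        sleep(poll_interval_s)
        waited += poll_interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production version would likely add backoff and jitter to the polling interval.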
API base URL
Example payload
{
  "instances": [
    {
      "prompt": "Slow cinematic drone shot over a futuristic skyline",
      "duration": "8s",
      "aspectRatio": "16:9",
      "fps": 24,
      "resolution": "1080p",
      "sampleCount": 1
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "durationSeconds": 8,
    "resolution": "1080p",
    "safetySetting": "block_only_high",
    "personGeneration": "allow_all",
    "includeRaiReason": true
  }
}
Our Current Pricing
Veo billing is implemented in the video workflow rather than being read from the provider response. The code computes cost from duration, model line, resolution tier, quality multiplier, and requested video count. The workflow treats veo-3.0-generate as 0.40 USD/sec and veo-3.0-fast as 0.15 USD/sec.
Quality multipliers are:
Final formula:
total_cost_usd =
    duration_seconds
    * base_cost_per_second(model, resolution)
    * quality_multiplier(quality)
    * video_count
Final rounding is to 6 decimal places.
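The formula can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual code. The per-second rates are the two quoted above; the per-resolution breakdown is not given here, so the sketch ignores the `resolution` argument. The quality multiplier is passed in by the caller rather than looked up, and the half-up rounding mode is an assumption, since only "6 decimal places" is specified.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-second base rates quoted in this section.
BASE_COST_PER_SECOND = {
    "veo-3.0-generate": Decimal("0.40"),
    "veo-3.0-fast": Decimal("0.15"),
}


def base_cost_per_second(model: str, resolution: str) -> Decimal:
    # Assumption: resolution does not change the rate in this sketch,
    # because no per-resolution figures are available here.
    return BASE_COST_PER_SECOND[model]


def total_cost_usd(
    model: str,
    resolution: str,
    duration_seconds: int,
    quality_multiplier: float,
    video_count: int,
) -> Decimal:
    """duration * base rate * quality multiplier * count, rounded to 6 places."""
    raw = (
        Decimal(str(duration_seconds))
        * base_cost_per_second(model, resolution)
        * Decimal(str(quality_multiplier))  # caller-supplied value
        * Decimal(str(video_count))
    )
    # Rounding mode is an assumption; the source only says "6 decimal places".
    return raw.quantize(Decimal("0.000001"), rounding=ROUND_HALF_UP)


# One 8-second veo-3.0-generate video at a multiplier of 1.0:
print(total_cost_usd("veo-3.0-generate", "1080p", 8, 1.0, 1))  # 3.200000
```

Using `Decimal` rather than `float` avoids binary rounding noise when the result is stored or compared during reconciliation.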
The model record itself also contains a simpler per-second price map with an audio modifier, but MODEL_COSTS_OVERVIEW.md is explicit that the effective workflow formula is the one above. For billing, prefer the workflow logic because that is what the application actually executes.
The best calculation strategy is request-driven. Store model, duration, resolution, quality, and sampleCount or requested count before submission, then compute cost deterministically once the request is accepted. If the provider later returns richer usage or audio-specific metering, you can compare it for reconciliation, but it is not the current source of truth.
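A minimal sketch of the request-driven approach: capture the billing inputs in an immutable record before submission, then treat any provider-reported figure as a reconciliation signal only. The names `BillingSnapshot` and `reconcile`, the tolerance value, and the status strings are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class BillingSnapshot:
    """Billing inputs captured before the request is submitted."""
    model: str
    duration_seconds: int
    resolution: str
    quality: str
    video_count: int


def reconcile(
    computed_usd: Decimal,
    provider_usd: Optional[Decimal],
    tolerance_usd: Decimal = Decimal("0.000001"),
) -> str:
    """Compare the locally computed cost with an optional provider figure.

    The computed value stays the source of truth; the return value only
    flags whether the provider figure agrees with it.
    """
    if provider_usd is None:
        return "no_provider_data"
    if abs(computed_usd - provider_usd) <= tolerance_usd:
        return "match"
    return "mismatch"


snapshot = BillingSnapshot(
    model="veo-3.0-generate",
    duration_seconds=8,
    resolution="1080p",
    quality="standard",  # hypothetical quality label
    video_count=1,
)
```

Freezing the dataclass means the inputs cannot change between submission and cost calculation, which keeps the computed cost reproducible from the stored record alone.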