Veo Model Image-to-Video API Documentation

Veo is a high-quality image-to-video generation model developed by Google. This document specifies the API for generating video from an image with Google's Veo models. All video generation calls use the same /v1/video/generations endpoint; the parameters vary by use case. Image data is provided as a base64-encoded string.


Supported Models

Currently supported models include:

| Model | Description |
| --- | --- |
| veo-3.0-generate-001 | Veo 3.0 image-to-video generation model |
| veo-3.0-fast-generate-001 | Veo 3.0 fast image-to-video generation model |
| veo-3.1-generate-001 | Veo 3.1 image-to-video generation model |
| veo-3.1-fast-generate-001 | Veo 3.1 fast image-to-video generation model |

Overview

The Veo model image-to-video feature provides an asynchronous task processing mechanism:

  1. Submit Task: Send an image and text prompt to create a video generation task
  2. Query Status: Query generation progress and status using the task ID
  3. Get Results: Retrieve the generated video file after task completion

Task Status Flow

queued → in_progress → completed
             ↓
           failed

  • queued: Task has been submitted and is waiting to be processed
  • in_progress: Task is being processed
  • completed: Task completed successfully, video has been generated
  • failed: Task failed
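
The submit, query, and retrieve flow above amounts to a polling loop. The following is a minimal sketch, not an official client: wait_for_task, fetch_status, and the interval and retry defaults are hypothetical names and values chosen for illustration. Because the response examples later in this document report "succeeded" rather than "completed", the sketch treats both as the successful terminal state.

```python
import time
from typing import Callable

# "completed" comes from the status flow; "succeeded" is what the response
# examples in this document show, so both are accepted as success.
TERMINAL_OK = {"completed", "succeeded"}
TERMINAL_FAILED = {"failed"}


def wait_for_task(
    fetch_status: Callable[[str], str],
    task_id: str,
    interval_s: float = 5.0,   # assumed polling interval, not from the API docs
    max_polls: int = 120,      # assumed upper bound, not from the API docs
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task reaches a terminal state and return that state."""
    for _ in range(max_polls):
        status = fetch_status(task_id)
        if status in TERMINAL_OK or status in TERMINAL_FAILED:
            return status
        # Any other value (queued, in_progress, processing) means "keep waiting".
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

fetch_status is injected so the loop can be exercised without network access; in practice it would wrap the status query endpoint.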

API List

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/video/generations | Submit video generation task (standard format) |
| GET | /v1/video/generations/{task_id} | Query task status (standard format) |
| POST | /v1/videos | Submit video generation task |
| GET | /v1/videos/{task_id} | Query task status |
| GET | /v1/videos/{task_id}/content | Get video content (streaming download) |
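
As an illustration of the streaming download endpoint, the sketch below copies the response body to a file-like object in chunks. It assumes the content endpoint accepts the same Bearer token as the other endpoints, which this document does not state explicitly; content_url and download_video are hypothetical helper names, and the base URL is taken from the curl examples in this document.

```python
import shutil
import urllib.request
from typing import BinaryIO

BASE_URL = "https://computevault.unodetech.xyz"


def content_url(task_id: str) -> str:
    """Build the URL of the video content endpoint for a task."""
    return f"{BASE_URL}/v1/videos/{task_id}/content"


def download_video(task_id: str, api_key: str, dest: BinaryIO,
                   opener=urllib.request.urlopen) -> None:
    """Stream the generated video into dest without loading it all in memory."""
    request = urllib.request.Request(
        content_url(task_id),
        # Assumption: the content endpoint uses the same Authorization header.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with opener(request) as response:
        shutil.copyfileobj(response, dest)
```

A typical call would open a local file in binary write mode and pass it as dest.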

Usage Examples

1. Basic Image-to-Video

The simplest form of image-to-video generation uses a single image as the first frame.

Request Body:

{
  "model": "veo-3.1-generate-001",
  "prompt": "A cat playing piano in a beautiful garden",
  "image": "<BASE64_ENCODED_IMAGE_DATA>",
  "metadata": {}
}
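
The request body above can be assembled from raw image bytes as follows. This is a small sketch; build_basic_request is a hypothetical helper name, and the default model is simply the one used in the example.

```python
import base64


def build_basic_request(image_bytes: bytes, prompt: str,
                        model: str = "veo-3.1-generate-001") -> dict:
    """Return a request body that uses image_bytes as the first frame."""
    return {
        "model": model,
        "prompt": prompt,
        # The API expects the image as a base64-encoded string.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "metadata": {},
    }
```

In practice image_bytes would come from reading an image file in binary mode, and the resulting dict would be serialized with json.dumps before sending.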

2. First and Last Frames

The image in the image field specifies the first frame of the video. The image in metadata.lastFrame specifies the last frame. This allows you to control both the starting and ending frames of the generated video.

Note: This feature is only supported by Veo 3.1 models.

Request Body:

{
  "model": "veo-3.1-generate-001",
  "prompt": "A cat playing piano in a beautiful garden",
  "image": "<BASE64_ENCODED_IMAGE_DATA>",
  "metadata": {
    "lastFrame": "<BASE64_ENCODED_IMAGE_DATA>"
  }
}
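
A sketch of building the first-and-last-frame body, with a client-side guard for the Veo 3.1 restriction noted above. The helper name build_first_last_request is hypothetical, and the guard assumes that Veo 3.1 model names start with "veo-3.1-", as the models listed in this document do.

```python
import base64


def _b64(data: bytes) -> str:
    """Encode raw bytes as a base64 ASCII string."""
    return base64.b64encode(data).decode("ascii")


def build_first_last_request(first_frame: bytes, last_frame: bytes, prompt: str,
                             model: str = "veo-3.1-generate-001") -> dict:
    """Return a request body that pins both the first and the last frame."""
    # Assumption: every Veo 3.1 model name begins with "veo-3.1-".
    if not model.startswith("veo-3.1-"):
        raise ValueError(f"lastFrame requires a Veo 3.1 model, got {model!r}")
    return {
        "model": model,
        "prompt": prompt,
        "image": _b64(first_frame),
        "metadata": {"lastFrame": _b64(last_frame)},
    }
```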

3. Reference Images

Images are specified in an array in metadata.referenceImages, containing up to 3 elements. Each reference image is an object containing image: base64-encoded image data and referenceType: a string with value "asset" or "style".

Note: This feature is only supported by veo-3.1-generate-001.

Request Body:

{
  "model": "veo-3.1-generate-001",
  "prompt": "A cat playing piano in a beautiful garden",
  "image": "<BASE64_ENCODED_IMAGE_DATA>",
  "metadata": {
    "referenceImages": [
      {
        "image": "<BASE64_ENCODED_IMAGE_DATA>",
        "referenceType": "asset"
      },
      {
        "image": "<BASE64_ENCODED_IMAGE_DATA>",
        "referenceType": "style"
      }
    ]
  }
}
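
The constraints on referenceImages (at most 3 elements, referenceType of "asset" or "style") can be checked before submission. The following is an illustrative sketch; build_reference_images is a hypothetical helper, and the server remains the authority on what it accepts.

```python
import base64
from typing import Sequence, Tuple

ALLOWED_REFERENCE_TYPES = {"asset", "style"}
MAX_REFERENCE_IMAGES = 3


def build_reference_images(refs: Sequence[Tuple[bytes, str]]) -> list:
    """Turn (image_bytes, reference_type) pairs into metadata.referenceImages."""
    if len(refs) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images allowed")
    result = []
    for image_bytes, reference_type in refs:
        if reference_type not in ALLOWED_REFERENCE_TYPES:
            raise ValueError(f"unsupported referenceType: {reference_type!r}")
        result.append({
            "image": base64.b64encode(image_bytes).decode("ascii"),
            "referenceType": reference_type,
        })
    return result
```

The returned list would be placed under metadata["referenceImages"] in the request body.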

Request Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model name, e.g., veo-3.1-generate-001 |
| prompt | string | Yes | Text prompt describing the video content to be generated |
| image | string | Yes | Base64-encoded image data for the first frame |
| metadata | object | No | Extended parameters object |

metadata Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| aspectRatio | string | No | Video aspect ratio, options: "16:9", "9:16" |
| durationSeconds | number | No | Video duration (seconds), options: 4, 6, 8 |
| negativePrompt | string | No | Negative prompt describing content not desired in the video |
| personGeneration | string | No | Person generation strategy, options: "allow_all" (text-to-video), "allow_adult" (image-to-video) |
| resolution | string | No | Video resolution, e.g., "1080p", "720p" |
| sampleCount | number | No | Number of videos to generate, default 1 |
| storageUri | string | No | Google Cloud Storage URI for storing generated videos |
| lastFrame | string | No | Base64-encoded image data for the last frame (Veo 3.1 models only) |
| referenceImages | array | No | Array of reference images, up to 3 elements (veo-3.1-generate-001 only) |
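
The enumerated options in the metadata table lend themselves to a quick client-side check. This is a minimal sketch: validate_metadata is a hypothetical helper, it covers only the fields whose allowed values are listed explicitly above, and the positive-integer rule for sampleCount is an assumption rather than something this document states.

```python
ALLOWED_VALUES = {
    "aspectRatio": {"16:9", "9:16"},
    "durationSeconds": {4, 6, 8},
    "personGeneration": {"allow_all", "allow_adult"},
}


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for field, allowed in ALLOWED_VALUES.items():
        if field in metadata and metadata[field] not in allowed:
            problems.append(
                f"{field}={metadata[field]!r} is not one of {sorted(allowed, key=str)}"
            )
    # Assumption: sampleCount must be a positive integer.
    count = metadata.get("sampleCount", 1)
    if not isinstance(count, int) or isinstance(count, bool) or count < 1:
        problems.append(f"sampleCount={count!r} should be a positive integer")
    return problems
```

Fields such as resolution are left unchecked because the table gives examples rather than an exhaustive list.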

referenceImages Array Elements:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| image | string | Yes | Base64-encoded image data |
| referenceType | string | Yes | Reference type, options: "asset" or "style" |

1. Submit Video Generation Task

Complete Request:

curl -X POST "https://computevault.unodetech.xyz/v1/video/generations" -H "Content-Type: application/json" -H "Authorization: Bearer API_KEY" -d @veoImageToVideoTest.json

Endpoint:

POST /v1/video/generations

Request Headers:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Content-Type | string | Yes | application/json |
| Authorization | string | Yes | Bearer API_KEY |

Response Example:

{
  "task_id": "TASK_ID"
}

Response Field Descriptions:

| Field | Type | Description |
| --- | --- | --- |
| task_id | string | Task ID for subsequent task status queries |
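
The curl call above translates to Python's standard library roughly as follows. This is a sketch under the assumption that the response is the JSON object shown in the response example; build_submit_request and submit are hypothetical helper names, and error handling for non-2xx responses is omitted.

```python
import json
import urllib.request

BASE_URL = "https://computevault.unodetech.xyz"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request for /v1/video/generations."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/video/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(api_key: str, payload: dict, opener=urllib.request.urlopen) -> str:
    """Submit a generation task and return its task_id."""
    with opener(build_submit_request(api_key, payload)) as response:
        return json.load(response)["task_id"]
```

The opener parameter exists only so the function can be tested with a stub; by default it performs a real HTTP request.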

2. Query Task Status

Complete Request (standard format):

curl -X GET "https://computevault.unodetech.xyz/v1/video/generations/TASK_ID" -H "Authorization: Bearer API_KEY"

Endpoint:

GET /v1/video/generations/{task_id}

Request Headers:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Authorization | string | Yes | Bearer API_KEY |

Path Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| task_id | string | Yes | Task ID |

Response Example (Processing):

{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": null,
    "status": "processing",
    "task_id": "TASK_ID",
    "url": ""
  }
}

Response Example (Success):

{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": null,
    "status": "succeeded",
    "task_id": "TASK_ID",
    "url": "https://computevault.unodetech.xyz/v1/videos/TASK_ID/content"
  }
}

Note: Depending on the AI service provider, the video will be returned either as base64-encoded data in the bytes_base64_encoded field (Vertex) or via a content URL in the url field (Gemini).

Response Example (Failed):

{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": null,
    "status": "failed",
    "task_id": "TASK_ID",
    "url": "Reference to video does not support this mix of reference images."
  }
}

When a task fails, the url field contains the error message instead of a video URL.

Response Field Descriptions:

| Field | Type | Description |
| --- | --- | --- |
| code | string | Response status code, "success" indicates success |
| data | object | Task data object |
| data.task_id | string | Task ID |
| data.status | string | Task status: queued, in_progress, succeeded, failed (the processing example above reports "processing") |
| data.format | string | Video format, e.g., "mp4" |
| data.url | string | Video access URL (when task succeeds), or error message (when task fails) |
| data.bytes_base64_encoded | string | Base64-encoded video data (when available) |
| data.error | object | Error information (when task fails) |
| message | string | Error message |
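
Because the url field carries either a video URL or an error message, a client has to inspect status before using it. The sketch below shows one way to interpret a status response; interpret_status and the keys of the returned dict are hypothetical, and treating any non-HTTP url on a successful task as "no video" is an assumption based on the examples in this document.

```python
import base64


def interpret_status(response: dict) -> dict:
    """Classify a status response as pending, failed, or done."""
    data = response.get("data") or {}
    status = data.get("status")
    url = data.get("url") or ""

    if status == "failed":
        # On failure the url field holds the error message.
        return {"state": "failed", "error": data.get("error") or url}

    if status in ("succeeded", "completed"):
        encoded = data.get("bytes_base64_encoded")
        if encoded:
            return {"state": "done", "video_bytes": base64.b64decode(encoded)}
        if url.startswith(("http://", "https://")):
            return {"state": "done", "video_url": url}
        # Assumption: a successful task without a usable URL produced no video.
        return {"state": "no_video", "message": url}

    # queued, in_progress, processing, or anything else: keep polling.
    return {"state": "pending"}
```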

Important Notice

Note: Due to Google's Responsible AI guidelines, some tasks routed through the Gemini channel may return a successful response while the video output is blocked. In that case, the filtering details appear in the metadata.rai_media_filtered_count and metadata.rai_media_filtered_reasons fields, as in the following example:

{
  "code": "success",
  "message": "",
  "data": {
    "bytes_base64_encoded": "",
    "error": null,
    "format": "mp4",
    "metadata": {
      "rai_media_filtered_count": 1,
      "rai_media_filtered_reasons": ["Sorry, we can't create videos with real people's names or likenesses. Please remove the celebrity reference and try again."]
    },
    "status": "succeeded",
    "task_id": "bW9kZWxzL3Zlby0zLjAtZmFzdC1nZW5lcmF0ZS0wMDEvb3BlcmF0aW9ucy9hd2IxZDhsNDVydGM",
    "url": "Sorry, we can't create videos with real people's names or likenesses. Please remove the celebrity reference and try again."
  }
}
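
A client can detect this case by checking the two metadata fields before treating the task as a usable result. The following sketch is illustrative only; rai_filter_reasons is a hypothetical helper name, and it assumes metadata may be null, as in the earlier response examples.

```python
def rai_filter_reasons(response: dict) -> list:
    """Return the filtering reasons if any output was blocked, else an empty list."""
    data = response.get("data") or {}
    # metadata is null in the ordinary response examples, so default to {}.
    metadata = data.get("metadata") or {}
    if metadata.get("rai_media_filtered_count", 0) > 0:
        return list(metadata.get("rai_media_filtered_reasons") or [])
    return []
```

An empty list means no filtering was reported; a non-empty list means the "succeeded" status should not be read as "a video is available".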
