Wan 2.7 Image-to-Video

curl --request POST \
  --url https://api.novita.ai/v3/async/wan2.7-i2v \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "seed": 123,
  "prompt": "<string>",
  "duration": 123,
  "image_url": "<string>",
  "watermark": true,
  "resolution": "<string>",
  "prompt_extend": true,
  "first_clip_url": "<string>",
  "last_frame_url": "<string>",
  "negative_prompt": "<string>",
  "driving_audio_url": "<string>"
}
'

This is an asynchronous API: the response contains only a task_id. Use that task_id to query the Task Result API and retrieve the video generation results.
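The submit step described above can be sketched in Python as follows. This is a minimal sketch rather than an official client: `build_submit_request` and `extract_task_id` are hypothetical helper names, the `Bearer` authorization scheme is an assumption (this page only shows `<authorization>`), and the response is assumed to be a JSON object with a top-level `task_id` field.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
SUBMIT_URL = "https://api.novita.ai/v3/async/wan2.7-i2v"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a generation task."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: Bearer scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_task_id(response_body: str) -> str:
    """Read task_id from the JSON response body (assumed top-level field)."""
    data = json.loads(response_body)
    task_id = data.get("task_id")
    if not task_id:
        raise ValueError("response did not contain a task_id")
    return task_id
```

Passing the request to `urllib.request.urlopen` would send it; the returned `task_id` is then what the Task Result API expects.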

Request Body

seed

integer

Random seed for improving reproducibility. Value range: [0, 2147483647].

prompt

string

Text prompt describing the desired video content. Supports Chinese and English. Length limit: 0 - 5000 characters.

duration

integer

default:5

Video duration in seconds, billed per second. Must be an integer. Value range: [2, 15].
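The documented limits for seed, prompt, and duration can be checked on the client before submitting. The sketch below is illustrative only: `check_basic_fields` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def check_basic_fields(payload: dict) -> list[str]:
    """Return a list of violations of the documented seed/prompt/duration limits."""
    errors = []

    seed = payload.get("seed")
    if seed is not None and not (0 <= seed <= 2147483647):
        errors.append("seed must be in [0, 2147483647]")

    prompt = payload.get("prompt")
    if prompt is not None and len(prompt) > 5000:
        errors.append("prompt must be at most 5000 characters")

    # duration defaults to 5 when omitted, per the field description
    duration = payload.get("duration", 5)
    if not isinstance(duration, int) or not (2 <= duration <= 15):
        errors.append("duration must be an integer in [2, 15]")

    return errors
```

An empty list means none of the three documented limits were violated.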

image_url

string

required

First frame image URL. Supported formats: JPEG, JPG, PNG (no transparency), BMP, WEBP. Resolution: width and height in [240, 8000] pixels, aspect ratio between 1:8 and 8:1, max 20MB. Mutually exclusive with first_clip_url; at least one of the two must be provided.

watermark

boolean

default:false

Add watermark to the output video (bottom-right corner).

resolution

string

default:"1080P"

Output video resolution tier; affects pricing. The video aspect ratio matches the input media. Optional values: 720P, 1080P.

prompt_extend

boolean

default:true

Enable intelligent prompt rewriting using an LLM. Improves generation quality for short prompts but increases processing time.

first_clip_url

string

First video clip URL for video continuation. The model extends the video based on its content. Supported formats: mp4, mov. Duration: 2~10s, resolution: width and height in [240, 4096] pixels, aspect ratio between 1:8 and 8:1, max 100MB. Mutually exclusive with image_url.
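The mutual-exclusion rule between image_url and first_clip_url, and the shared size and aspect-ratio limits, can be expressed as two small checks. This is a sketch under assumptions: `pick_source` and `within_limits` are hypothetical helpers, and the width and height are assumed to be known to the caller already (the code does not download or probe the media).

```python
def pick_source(payload: dict) -> str:
    """Return which source field is set; exactly one of the two must be present."""
    has_image = bool(payload.get("image_url"))
    has_clip = bool(payload.get("first_clip_url"))
    if has_image == has_clip:  # both set, or neither set
        raise ValueError("provide exactly one of image_url or first_clip_url")
    return "image_url" if has_image else "first_clip_url"


def within_limits(width: int, height: int, max_side: int) -> bool:
    """Check the documented pixel range and the 1:8 to 8:1 aspect-ratio bound.

    max_side is 8000 for images and 4096 for video clips, per the field
    descriptions; the minimum side is 240 in both cases.
    """
    sides_ok = 240 <= width <= max_side and 240 <= height <= max_side
    # Integer comparison avoids floating-point division for the ratio check.
    ratio_ok = width <= 8 * height and height <= 8 * width
    return sides_ok and ratio_ok
```

For example, `within_limits(1920, 1080, 8000)` checks an image and `within_limits(1920, 1080, 4096)` checks a clip.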

last_frame_url

string

Last frame image URL. Combined with the first frame (image_url) to generate a first-and-last-frame video. Same format restrictions as the first frame.

negative_prompt

string

Negative prompt describing undesired content in the video. Supports Chinese and English. Length limit: 0 - 500 characters.

driving_audio_url

string

Driving audio URL. When provided, the model uses it as the driving source for video generation (e.g., lip sync, motion beats). When omitted, the model auto-generates matching background music or sound effects. Supported formats: wav, mp3. Duration: 2~30s, max 15MB.
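Putting the field descriptions together, a request body can be assembled by starting from the documented defaults and dropping any optional field left unset. This is a minimal sketch: `build_payload` is a hypothetical helper, and sending the defaults explicitly is a design choice for readability, not something the API is known to require.

```python
# Defaults and allowed values as listed in the field descriptions.
DEFAULTS = {
    "duration": 5,
    "watermark": False,
    "resolution": "1080P",
    "prompt_extend": True,
}
ALLOWED_RESOLUTIONS = {"720P", "1080P"}


def build_payload(**fields) -> dict:
    """Merge caller fields over the documented defaults, skipping None values."""
    provided = {key: value for key, value in fields.items() if value is not None}
    payload = {**DEFAULTS, **provided}
    if payload["resolution"] not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 720P or 1080P")
    return payload
```

A call such as `build_payload(image_url="https://example.com/first.png", resolution="720P")` yields a body with the other three defaults filled in.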
