Novita AI Wan 2.2 Image to Video API | Generate videos from images and text

The Wan 2.2 Professional Image-to-Video model generates a 5-second silent video from an initial frame image and a text prompt. It offers significant improvements in visual detail and motion stability.

This is an asynchronous API: the response contains only a task_id. Pass that task_id to the Task Result API to retrieve the video generation results.
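The submit-then-poll flow described above can be sketched as a small helper. This is only an illustration: the Task Result API's URL and response fields are not given on this page, so `fetch_result` and `is_finished` are hypothetical callables that the caller supplies, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable


def wait_for_result(
    task_id: str,
    fetch_result: Callable[[str], Any],   # hypothetical: calls the Task Result API
    is_finished: Callable[[Any], bool],   # hypothetical: decides whether the task is done
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll with fetch_result(task_id) until is_finished(result) is true."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_result(task_id)
        if is_finished(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} s")
        sleep(interval_s)
```

Passing the two callables in keeps the loop independent of any particular response schema, which has to be taken from the Task Result API documentation.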

Request Headers

Content-Type

string

required

Supports: application/json

Authorization

string

required

Bearer authentication format, for example: Bearer {{API Key}}.
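As a minimal sketch, the two required headers can be assembled like this. The function name `build_headers` is made up for illustration; only the header names and the Bearer format come from the description above.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the two required request headers for a given API key."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```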

Request Body

input

object

required

Basic input information, such as prompts.

prompt

string

Text prompt. Supports both English and Chinese, with a maximum length of 800 characters; any excess is automatically truncated. Example value: A small cat running on the grass.

negative_prompt

string

Negative prompt, describing content that should be avoided in the video. Supports both English and Chinese, with a maximum length of 500 characters; any excess is automatically truncated. Example value: Low resolution, errors, worst quality, low quality, incomplete, extra fingers, disproportionate, etc.

img_url

string

required

The URL of the initial frame image used for video generation. The URL must be publicly accessible and use the HTTP or HTTPS protocol. Image restrictions:

Image formats: JPEG, JPG, PNG (no support for transparency), BMP, WEBP.
Image resolution: The width and height of the image should be within the range of [360, 2000] pixels.
File size: No more than 10MB.
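The restrictions above can be checked on the client before submitting a request. The sketch below is not part of the API; `check_image` is a hypothetical helper that takes image properties you have already measured. It reads "10MB" as 10 × 1024 × 1024 bytes, which is an assumption, and it does not check public accessibility or PNG transparency.

```python
from urllib.parse import urlparse

ALLOWED_FORMATS = {"JPEG", "JPG", "PNG", "BMP", "WEBP"}
MIN_SIDE, MAX_SIDE = 360, 2000
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def check_image(url: str, fmt: str, width: int, height: int, size_bytes: int) -> list[str]:
    """Return a list of violated restrictions (an empty list means none found)."""
    problems = []
    if urlparse(url).scheme.lower() not in ("http", "https"):
        problems.append("URL must use HTTP or HTTPS")
    if fmt.upper() not in ALLOWED_FORMATS:
        problems.append(f"unsupported image format: {fmt}")
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            problems.append(f"{name} {value} is outside [{MIN_SIDE}, {MAX_SIDE}]")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10MB")
    return problems
```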

parameters

object

Video processing parameters, such as specifying the output video resolution and duration.

resolution

string

The resolution tier of the generated video. Options: 480P, 1080P. The default value is 1080P. Example value: 1080P.

Note on how the resolution tier affects the output resolution: the model attempts to keep the aspect ratio of the output video consistent with the input image and adjusts the total pixel count of the video to be close to the selected tier.

480P: Video resolution typically refers to 640×480 (approximately 310,000 pixels), with an aspect ratio of 4:3.
1080P: Video resolution typically refers to 1920×1080 (approximately 2,070,000 pixels), with an aspect ratio of 16:9.

Example: If the input image has an aspect ratio of 4:5 and the 480P tier is selected, the output video will maintain a 4:5 aspect ratio, with a resolution adjusted to be close to 310,000 pixels. For instance, the output video resolution might be 480×600, totaling 288,000 pixels (this data is for reference only, actual output may vary).
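The arithmetic behind this example can be approximated as follows. This is a rough estimate only, assuming the model targets the tier's pixel count exactly and applies no further rounding; `estimate_output_size` is a hypothetical helper, and real outputs (such as the 480×600 above) may be snapped to other dimensions.

```python
import math

# Pixel budgets taken from the tier descriptions above.
TIER_PIXELS = {"480P": 640 * 480, "1080P": 1920 * 1080}


def estimate_output_size(aspect_w: float, aspect_h: float, tier: str = "1080P") -> tuple[int, int]:
    """Estimate (width, height) with the input aspect ratio and the tier's pixel count."""
    target = TIER_PIXELS[tier]
    ratio = aspect_w / aspect_h
    height = math.sqrt(target / ratio)  # from width * height = target and width = ratio * height
    width = height * ratio
    return round(width), round(height)


print(estimate_output_size(4, 5, "480P"))  # (496, 620)
```

For a 4:5 input at the 480P tier the estimate is 496×620 (307,520 pixels), in the same range as the 288,000-pixel reference figure.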

duration

integer

The duration of the generated video, in seconds. The default value is 5; the duration is currently fixed at 5 seconds and cannot be changed. Example value: 5.

prompt_extend

bool

Whether to enable intelligent prompt rewriting. When enabled, a large model is used to intelligently rewrite the input prompt. This significantly improves the generation effect for shorter prompts but increases processing time.

true: Default value, enable intelligent rewriting.
false: Do not enable intelligent rewriting.

Example value: true.

seed

integer

Random seed, used to control the randomness of the model’s generated content. The value range is [0, 2147483647]. If not provided, the algorithm automatically generates a random number as the seed. To keep the generated content relatively stable, reuse the same seed value. Example value: 12345.
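Putting the fields together, a request body might be assembled as in the sketch below. The nesting (`input` and `parameters`) follows the field descriptions above; `build_payload` and its local validation are illustrative assumptions, not part of the API, and the endpoint URL is not shown here because it is not given on this page.

```python
import json
from typing import Optional


def build_payload(
    img_url: str,
    prompt: Optional[str] = None,
    negative_prompt: Optional[str] = None,
    resolution: str = "1080P",
    prompt_extend: bool = True,
    seed: Optional[int] = None,
) -> dict:
    """Build the JSON request body; optional fields are omitted when not set."""
    if resolution not in ("480P", "1080P"):
        raise ValueError("resolution must be '480P' or '1080P'")
    if seed is not None and not 0 <= seed <= 2147483647:
        raise ValueError("seed must be in [0, 2147483647]")

    input_obj = {"img_url": img_url}
    if prompt is not None:
        input_obj["prompt"] = prompt
    if negative_prompt is not None:
        input_obj["negative_prompt"] = negative_prompt

    # duration is currently fixed at 5 seconds, so it is not exposed as an argument.
    parameters = {"resolution": resolution, "duration": 5, "prompt_extend": prompt_extend}
    if seed is not None:
        parameters["seed"] = seed
    return {"input": input_obj, "parameters": parameters}


body = json.dumps(build_payload(
    "https://example.com/first-frame.png",  # placeholder URL
    prompt="A small cat running on the grass.",
    resolution="480P",
    seed=12345,
))
```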

Response

task_id

string

required

Use the task_id to request the Task Result API to retrieve the generated outputs.
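A minimal sketch of reading the task_id out of the response body, assuming the response is a JSON object with a top-level `task_id` string as described above. `extract_task_id` is a hypothetical helper name.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Parse the response body and return its task_id, or raise ValueError."""
    data = json.loads(response_text)
    task_id = data.get("task_id") if isinstance(data, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id string")
    return task_id
```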
