Novita AI Wan 2.1 Image to Video API | Generate videos from images and text

Accelerated inference for Wan 2.1 14B Image-to-Video, a comprehensive and open suite of video foundation models that pushes the boundaries of video generation. By default, the API generates a 5-second video.

This is an asynchronous API: the request returns only a task_id. Use the task_id to call the Task Result API and retrieve the video generation results.

Request Headers

Content-Type

string

required

Supports: application/json

Authorization

string

required

Bearer authentication format, for example: Bearer {{API Key}}.
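
As a minimal sketch, the two required headers could be assembled in Python as follows. The helper name build_headers is hypothetical and only for illustration; the header names and values come from the descriptions above.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the two required request headers for the given API key."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Content-Type": "application/json",    # the only supported content type
        "Authorization": f"Bearer {api_key}",  # Bearer authentication format
    }
```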

Request Body

extra

object

Optional extra parameters for the request.

Properties of extra:

webhook

object

Webhook settings. More details can be found at Webhook Documentation.

Properties of webhook:

url

string

required

The URL of the webhook endpoint. Novita AI will send the task generated outputs to your specified webhook endpoint.

test_mode

object

By specifying Test Mode, a mock event will be sent to the webhook endpoint.

Properties of test_mode:

enabled

boolean

required

Set to true to enable Test Mode, or false to disable it. The default is false.

return_task_status

string

required

Controls the content of the mock event. When set to TASK_STATUS_SUCCEED, you’ll receive a normal response; when set to TASK_STATUS_FAILED, you’ll receive an error response. Supports: TASK_STATUS_SUCCEED, TASK_STATUS_FAILED.

prompt

string

required

Prompt text required to guide the generation. Range: 1 <= x <= 2000.

image_url

string

required

The URL of the image to be used for video generation.

negative_prompt

string

Negative prompts instruct the model on what elements to avoid generating. Range: 0 <= x <= 2000.

width

integer

Width of the output video. Supports: 480, 720, 832, 1280. Default: 832. If either the width or the height is not specified, they are forced to 832 and 480 respectively.

height

integer

Height of the output video. Supports:

(480p) 832 for a width of 480
(480p) 480 for a width of 832
(720p) 1280 for a width of 720
(720p) 720 for a width of 1280

Default: 480. If either the width or the height is not specified, they are forced to 832 and 480 respectively.

The output video will maintain the input image’s aspect ratio, and the width x height setting only determines the output video’s clarity. For example, a 720p video will be clearer than a 480p video.

loras

object[]

LoRA models to be applied to the video generation. Currently supports up to 3 LoRAs.

Properties of each loras item:

path

string

required

The path to the LoRA model. You can specify either a LoRA model name from Hugging Face, for example: Remade-AI/Painting; or a model download URL from Civitai, for example: https://civitai.com/api/download/models/1513385?type=Model&format=SafeTensor.

The LoRA model must be compatible with Wan2.1 14B I2V, otherwise it will not work. Please check compatibility before using it.

scale

number(float32)

required

The scale of the LoRA. The larger the value, the more strongly the output is biased towards the LoRA. Range: 0 <= x <= 4.0.
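
As an illustration, the loras array could be assembled and checked against the documented limits (at most 3 entries, scale between 0 and 4.0) like this. The helper name build_loras is hypothetical. Compatibility with Wan2.1 14B I2V cannot be verified on the client, so the sketch does not attempt it.

```python
MAX_LORAS = 3                      # documented upper limit
SCALE_MIN, SCALE_MAX = 0.0, 4.0    # documented scale range


def build_loras(entries: list[tuple[str, float]]) -> list[dict]:
    """Turn (path, scale) pairs into the `loras` array of the request body."""
    if len(entries) > MAX_LORAS:
        raise ValueError(f"at most {MAX_LORAS} LoRAs are supported")
    loras = []
    for path, scale in entries:
        if not path:
            raise ValueError("each LoRA needs a non-empty path")
        if not SCALE_MIN <= scale <= SCALE_MAX:
            raise ValueError(f"scale {scale} is outside {SCALE_MIN}..{SCALE_MAX}")
        loras.append({"path": path, "scale": float(scale)})
    return loras


# Uses the Hugging Face model name given as an example above.
loras = build_loras([("Remade-AI/Painting", 1.0)])
```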

seed

integer

A seed is a number used to generate the initial noise, which makes generation deterministic. Using the same seed and the same set of parameters produces identical content each time. Range: -1 <= x <= 9999999999. Default: -1.

steps

integer

The number of inference steps. Range: 1 <= x <= 40. Default: 30.

guidance_scale

float

The guidance scale controls how closely the generated content follows the prompt. Range: 0 <= x <= 10. Default: 5.0.

flow_shift

float

The flow_shift parameter primarily affects the speed and magnitude of object movement in the video. Higher values produce more pronounced and faster movement, while lower values make the motion slower and more subtle. Range: 1 <= x <= 10. Default: 5.0.

enable_safety_checker

boolean

The enable_safety_checker parameter controls whether the safety filter is applied to the generated content. When enabled, it helps filter out potentially harmful or inappropriate content from the video output. Default: true.

fast_mode

boolean

Whether to enable fast mode. Fast mode generates videos more quickly and at a lower price, but may reduce quality. Default: false.
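
The documented ranges can be checked locally before a request is sent. The following is a sketch under two assumptions: the ranges for prompt and negative_prompt are taken to be lengths in characters, and the helper name find_range_errors is hypothetical. The server may apply additional or different checks.

```python
# Inclusive numeric ranges copied from the parameter descriptions.
NUMERIC_RANGES = {
    "seed": (-1, 9999999999),
    "steps": (1, 40),
    "guidance_scale": (0, 10),
    "flow_shift": (1, 10),
}


def find_range_errors(payload: dict) -> list[str]:
    """Return a list of human-readable range violations (empty if none)."""
    errors = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in payload and not low <= payload[name] <= high:
            errors.append(f"{name} must be between {low} and {high}")
    # Assumption: the prompt ranges describe lengths in characters.
    if not 1 <= len(payload.get("prompt", "")) <= 2000:
        errors.append("prompt must be 1 to 2000 characters long")
    if len(payload.get("negative_prompt", "")) > 2000:
        errors.append("negative_prompt must be at most 2000 characters long")
    return errors
```

Returning a list rather than raising on the first problem lets a caller report every violation at once.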

Response

task_id

string

required

Use the task_id to request the Task Result API to retrieve the generated outputs.

Example

Here is an example of how to use the Wan 2.1 Image to Video API.

Generate a task_id by sending a POST request to the Wan 2.1 Image to Video API.

Request:

curl --location 'https://api.novita.ai/v3/async/wan-i2v' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
    "image_url": "https://pub-f964a1c641c04024bce400ad128c8cd6.r2.dev/wan-i2v-input-image.jpg",
    "height": 1280,
    "width": 720,
    "steps": 25,
    "seed": -1,
    "prompt": "A cute panda is walking in the grassland slowly."
}'

Response:

{
    "task_id": "{Returned Task ID}"
}
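
The same request can be sketched in Python using only the standard library. The endpoint, headers, and body fields come from the curl example above; the function names build_submit_request and submit, and the 30-second timeout, are illustrative choices rather than part of the API.

```python
import json
import urllib.request

API_URL = "https://api.novita.ai/v3/async/wan-i2v"  # endpoint from the curl example


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a task."""
    for field in ("prompt", "image_url"):  # the two required body fields
        if not payload.get(field):
            raise ValueError(f"missing required field: {field}")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["task_id"]
```

Separating request construction from sending makes it possible to inspect or log the request before any network call is made. urllib.request.urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so callers may want to catch that exception.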

Use the task_id to get the output videos.

HTTP status codes in the 2xx range indicate that the request has been successfully accepted, while status codes in the 5xx range indicate internal server errors. The URLs of the generated videos are in the videos field of the response.

Request:

curl --location --request GET 'https://api.novita.ai/v3/async/task-result?task_id={Returned Task ID}' \
--header 'Authorization: Bearer {{API Key}}'

Response:

{
    "task": {
        "task_id": "{Returned Task ID}",
        "task_type": "WAN_IMG_TO_VIDEO",
        "status": "TASK_STATUS_SUCCEED",
        "reason": "",
        "eta": 0,
        "progress_percent": 100
    },
    "images": [],
    "videos": [
        {
            "video_url": "{The URL of the generated video}",
            "video_url_ttl": "3600",
            "video_type": "mp4"
        }
    ]
}
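
Because the API is asynchronous, a client typically polls the Task Result API until the task finishes. The sketch below assumes that any status other than TASK_STATUS_SUCCEED or TASK_STATUS_FAILED means the task is still in progress; the function names, the 5-second interval, and the poll limit are illustrative assumptions.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable

RESULT_URL = "https://api.novita.ai/v3/async/task-result"  # from the curl example


def fetch_task_result(api_key: str, task_id: str) -> dict:
    """Perform one GET request against the Task Result API."""
    query = urllib.parse.urlencode({"task_id": task_id})
    request = urllib.request.Request(
        f"{RESULT_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_videos(
    fetch: Callable[[], dict], interval: float = 5.0, max_polls: int = 120
) -> list[str]:
    """Call fetch() repeatedly until the task finishes; return the video URLs."""
    for _ in range(max_polls):
        result = fetch()
        status = result["task"]["status"]
        if status == "TASK_STATUS_SUCCEED":
            return [video["video_url"] for video in result.get("videos", [])]
        if status == "TASK_STATUS_FAILED":
            raise RuntimeError(result["task"].get("reason") or "task failed")
        # Assumption: every other status means the task is still running.
        time.sleep(interval)
    raise TimeoutError(f"task not finished after {max_polls} polls")
```

A caller might combine the two as wait_for_videos(lambda: fetch_task_result(api_key, task_id)). Passing the fetch step in as a callable keeps the polling loop independent of the network layer. Note that the response carries a video_url_ttl value, so the returned URLs should be downloaded promptly rather than stored for later use.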
