Wan 2.7 Text-to-Video

curl --request POST \
  --url https://api.novita.ai/v3/async/wan2.7-t2v \
  --header 'Authorization: Bearer <API Key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "seed": 123,
  "size": "<string>",
  "prompt": "<string>",
  "duration": 123,
  "audio_url": "<string>",
  "watermark": true,
  "prompt_extend": true,
  "negative_prompt": "<string>"
}
'

{
  "task_id": "<string>"
}

The Wan 2.7 Text-to-Video model generates smooth videos from text prompts. It supports audio-driven generation from a supplied audio file, or generates a soundtrack automatically when none is provided. Output is available at 720P and 1080P, with durations of 2 to 15 seconds, billed per second. The output video includes audio by default.

This is an asynchronous API: the response contains only a task_id. Pass that task_id to the Task Result API to retrieve the generated video.
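The submit step can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names `build_request` and `submit_task` are hypothetical, and the timeout value is an assumption. The URL, headers, and `task_id` response field come from this page.

```python
import json
import urllib.request

API_URL = "https://api.novita.ai/v3/async/wan2.7-t2v"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request. Nothing is sent until urlopen is called."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_task(api_key: str, payload: dict, timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the JSON response."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["task_id"]
```

Calling `submit_task(api_key, {"prompt": "..."})` would return the task_id string, which is then passed to the Task Result API. Separating `build_request` from the network call makes the request easy to inspect without sending anything.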

Request Headers

Content-Type

string

required

Supports: application/json

Authorization

string

required

Bearer authentication format, for example: Bearer {{API Key}}.

Request Body

seed

integer

Random seed for improving reproducibility. Value range: [0, 2147483647].

size

string

default:"1920*1080"

Output video resolution (width*height); affects pricing. 720P: 1280*720 (16:9), 720*1280 (9:16), 960*960 (1:1), 1088*832 (4:3), 832*1088 (3:4). 1080P: 1920*1080 (16:9), 1080*1920 (9:16), 1440*1440 (1:1), 1632*1248 (4:3), 1248*1632 (3:4).
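Because size affects pricing, it can help to resolve a size string to its resolution tier before submitting. The sketch below assumes the value lists given for this field; the helper names `resolution_tier` and `dimensions` are hypothetical and not part of the API.

```python
from typing import Tuple

# Allowed size values, grouped by resolution tier as listed for this field.
SIZES = {
    "720P": ["1280*720", "720*1280", "960*960", "1088*832", "832*1088"],
    "1080P": ["1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632"],
}


def resolution_tier(size: str) -> str:
    """Return "720P" or "1080P" for an allowed size, else raise ValueError."""
    for tier, values in SIZES.items():
        if size in values:
            return tier
    raise ValueError(f"unsupported size: {size!r}")


def dimensions(size: str) -> Tuple[int, int]:
    """Split an allowed size string into (width, height) in pixels."""
    resolution_tier(size)  # reject values outside the documented list
    width, height = size.split("*")
    return int(width), int(height)
```

Note that the separator is an asterisk, so a string such as `1920x1080` would be rejected by this check.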

prompt

string

required

Text prompt describing the desired video content. Supports Chinese and English, up to 1500 characters; any excess is automatically truncated. Length limit: 0 - 1500.

duration

integer

default:5

Video duration in seconds, billed per second. Must be an integer. Value range: [2, 15].

audio_url

string

Audio file URL. The model uses this audio to drive video generation (lip-sync, beat-matching, etc.). If it is not provided, the model automatically generates matching background music or sound effects. Formats: wav, mp3. Duration: 3 to 30 seconds. Max size: 15MB. If the audio is longer than the video, it is truncated; if it is shorter, the remainder of the video is silent.
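The truncation and silence rule can be worked through with a small helper. This is only an illustration of the arithmetic described above; `audio_coverage` is a hypothetical name, and the range checks mirror the limits stated for audio_url and duration.

```python
from typing import Tuple


def audio_coverage(audio_seconds: float, video_seconds: int) -> Tuple[float, float]:
    """Return (seconds of audio used, seconds of silent video at the end)."""
    if not 3 <= audio_seconds <= 30:
        raise ValueError("audio duration must be between 3 and 30 seconds")
    if not 2 <= video_seconds <= 15:
        raise ValueError("video duration must be between 2 and 15 seconds")
    # Audio longer than the video is cut off at the video length.
    used = min(audio_seconds, video_seconds)
    # Audio shorter than the video leaves the rest of the video silent.
    return used, video_seconds - used
```

For example, a 10-second clip with a 5-second video uses the first 5 seconds of audio and leaves no silence, while a 3.5-second clip with the same video leaves 1.5 seconds silent.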

watermark

boolean

default:false

Add watermark to the output video (bottom-right corner).

prompt_extend

boolean

default:true

Enable intelligent prompt rewriting. Uses a language model to enhance short prompts, improving generation quality at the cost of slightly longer processing time.

negative_prompt

string

Negative prompt describing undesired content in the video. Supports Chinese and English, up to 500 characters. Length limit: 0 - 500.
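The parameters above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the defaults and ranges listed in this section; `build_payload` is a hypothetical helper, and choosing to raise on an over-long negative_prompt (rather than truncate it) is an assumption. The prompt is left untouched because the service is described as truncating it automatically.

```python
from typing import Optional

MAX_SEED = 2147483647
ALLOWED_SIZES = {
    "1280*720", "720*1280", "960*960", "1088*832", "832*1088",
    "1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632",
}


def build_payload(
    prompt: str,
    *,
    seed: Optional[int] = None,
    size: str = "1920*1080",
    duration: int = 5,
    audio_url: Optional[str] = None,
    watermark: bool = False,
    prompt_extend: bool = True,
    negative_prompt: Optional[str] = None,
) -> dict:
    """Validate the documented ranges and return a JSON-ready request body."""
    if not prompt:
        raise ValueError("prompt is required")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size!r}")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be an integer in [2, 15]")
    payload = {
        "prompt": prompt,
        "size": size,
        "duration": duration,
        "watermark": watermark,
        "prompt_extend": prompt_extend,
    }
    # Optional fields are only included when the caller sets them.
    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError(f"seed must be in [0, {MAX_SEED}]")
        payload["seed"] = seed
    if negative_prompt is not None:
        if len(negative_prompt) > 500:
            raise ValueError("negative_prompt is limited to 500 characters")
        payload["negative_prompt"] = negative_prompt
    if audio_url is not None:
        payload["audio_url"] = audio_url
    return payload
```

Calling `build_payload("a cat on a beach", duration=8)` returns a dictionary that can be serialized with `json.dumps` and sent as the request body.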

Response

task_id

string

Use the task_id to request the Task Result API to retrieve the generated outputs.
