The Wan 2.7 Text-to-Video model generates smooth videos from text prompts. It supports audio-driven generation or an automatically generated soundtrack. Available resolutions are 720P and 1080P, and the duration can be 2 to 15 seconds. Usage is billed per second of video. Output includes audio by default.
This is an asynchronous API: the call returns only a task_id. Pass that task_id to the Task Result API to retrieve the video generation results.
Supports: application/json
Bearer authentication format, for example: Bearer {{API Key}}.
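A minimal sketch of the request headers implied by the two lines above. It assumes the content type is sent in the standard HTTP Content-Type header and the Bearer token in the standard Authorization header; this page does not name the headers explicitly, and build_headers is a hypothetical helper.

```python
def build_headers(api_key: str) -> dict:
    """Return headers for a JSON request using Bearer authentication.

    The header names are the standard HTTP ones and are an assumption here;
    the page only states the media type and the "Bearer {{API Key}}" format.
    """
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key.strip()}",
    }
```

Reading the key from an environment variable rather than hard-coding it keeps it out of source control.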
Request Body
Random seed for improving reproducibility. Value range: [0, 2147483647].
size
string
default:"1920*1080"
Output video resolution (width*height); affects pricing. 720P: 1280*720 (16:9), 720*1280 (9:16), 960*960 (1:1), 1088*832 (4:3), 832*1088 (3:4). 1080P: 1920*1080 (16:9), 1080*1920 (9:16), 1440*1440 (1:1), 1632*1248 (4:3), 1248*1632 (3:4).
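Because the resolution tier affects pricing, it can help to validate a size string and look up its tier before submitting. The sketch below encodes the ten allowed values listed above. The helper names (resolution_tier, parse_size) are hypothetical and not part of the API.

```python
# Allowed "width*height" strings, grouped by resolution tier.
SIZES_BY_TIER = {
    "720P": ("1280*720", "720*1280", "960*960", "1088*832", "832*1088"),
    "1080P": ("1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632"),
}
DEFAULT_SIZE = "1920*1080"


def resolution_tier(size: str = DEFAULT_SIZE) -> str:
    """Return "720P" or "1080P" for an allowed size, else raise ValueError."""
    for tier, sizes in SIZES_BY_TIER.items():
        if size in sizes:
            return tier
    raise ValueError(f"unsupported size: {size!r}")


def parse_size(size: str = DEFAULT_SIZE) -> tuple:
    """Validate the size, then split it into integer (width, height)."""
    resolution_tier(size)  # raises ValueError for anything not in the list
    width, height = size.split("*")
    return int(width), int(height)
```

Note that the separator is an asterisk, so a string such as "1920x1080" is rejected by this check.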
Text prompt describing the desired video content. Supports Chinese and English. Length limit: 0 - 1500 characters; any excess is truncated automatically.
Video duration in seconds, billed per second. Integer. Value range: [2, 15].
Audio file URL. The model uses this audio to drive video generation (lip-sync, beat-matching, etc.). If it is not provided, the model auto-generates matching background music or sound effects. Formats: wav, mp3. Duration: 3 to 30 seconds. Max size: 15MB. If the audio is longer than the video, it is truncated; if it is shorter, the remainder of the video is silent.
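The truncation and silence rule above is simple arithmetic, sketched here for planning purposes. The ranges come from this page (video 2 to 15 seconds, audio 3 to 30 seconds); the AudioAlignment type and align_audio function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioAlignment:
    audio_used_s: float   # seconds of the supplied audio that are heard
    audio_cut_s: float    # seconds of audio dropped by truncation
    silent_tail_s: float  # seconds of video left without the supplied audio


def align_audio(video_s: int, audio_s: float) -> AudioAlignment:
    """Apply the documented rule: longer audio is cut, shorter leaves silence."""
    if not 2 <= video_s <= 15:
        raise ValueError("video duration must be within [2, 15] seconds")
    if not 3 <= audio_s <= 30:
        raise ValueError("audio duration must be within [3, 30] seconds")
    return AudioAlignment(
        audio_used_s=float(min(audio_s, video_s)),
        audio_cut_s=max(0.0, audio_s - video_s),
        silent_tail_s=max(0.0, video_s - audio_s),
    )
```

For example, align_audio(10, 4.0) reports 4 seconds of audio used and a 6 second silent tail, which suggests either shortening the video or supplying longer audio.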
Whether to add a watermark to the output video (bottom-right corner).
Enable intelligent prompt rewriting. Uses a language model to enhance short prompts, improving generation quality at the cost of slightly longer processing time.
Negative prompt describing undesired content in the video. Supports Chinese and English. Length limit: 0 - 500 characters.
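The constraints in this Request Body section can be checked client-side before a request is sent. The sketch below is illustrative only: apart from size, the JSON field names (prompt, negative_prompt, duration, seed, audio_url, watermark, prompt_extend) are assumptions, since this page does not show them, and build_request_body is a hypothetical helper. Confirm the real field names against the API schema before use.

```python
import json

ALLOWED_SIZES = {
    "1280*720", "720*1280", "960*960", "1088*832", "832*1088",
    "1920*1080", "1080*1920", "1440*1440", "1632*1248", "1248*1632",
}


def build_request_body(prompt, *, size="1920*1080", duration=None, seed=None,
                       negative_prompt=None, audio_url=None,
                       watermark=None, prompt_extend=None):
    """Validate inputs against the documented limits and return a dict.

    All keys except "size" are ASSUMED names, not confirmed by this page.
    """
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size!r}")
    if duration is not None and not (isinstance(duration, int) and 2 <= duration <= 15):
        raise ValueError("duration must be an integer within [2, 15]")
    if seed is not None and not 0 <= seed <= 2147483647:
        raise ValueError("seed must be within [0, 2147483647]")
    if negative_prompt is not None and len(negative_prompt) > 500:
        raise ValueError("negative prompt is limited to 500 characters")

    # The service truncates prompts beyond 1500 characters; trimming locally
    # mirrors that (assuming Python's code-point count matches its counting).
    body = {"prompt": prompt[:1500], "size": size}
    optional = {
        "duration": duration,
        "seed": seed,
        "negative_prompt": negative_prompt,
        "audio_url": audio_url,
        "watermark": watermark,
        "prompt_extend": prompt_extend,
    }
    # Omit unset fields so that server-side defaults apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


if __name__ == "__main__":
    print(json.dumps(build_request_body("A cat surfing at sunset", duration=5), indent=2))
```

Leaving optional fields out of the body, rather than sending guessed defaults, avoids overriding whatever defaults the service applies.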
Response
Use the task_id to request the Task Result API to retrieve the generated outputs.
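Since only the task_id comes back, a client typically polls the Task Result API until the task finishes. The sketch below shows one generic polling loop. The Task Result API's URL and response fields are not given on this page, so the caller supplies both the fetch function and the completion check; wait_for_result and its parameters are hypothetical.

```python
import time
from typing import Callable


def wait_for_result(fetch: Callable[[str], dict],
                    task_id: str,
                    is_finished: Callable[[dict], bool],
                    interval_s: float = 5.0,
                    timeout_s: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch(task_id) repeatedly until is_finished(result) is true.

    fetch:       caller-supplied function that queries the Task Result API.
    is_finished: caller-supplied predicate on the parsed response.
    Raises TimeoutError if the next wait would pass the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch(task_id)
        if is_finished(result):
            return result
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout_s}s")
        sleep(interval_s)
```

Passing fetch and is_finished as arguments keeps the loop independent of any particular HTTP client or response format, and makes it easy to test with a stub.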