The Wan 2.5 Preview Text-to-Video model generates high-quality video from text descriptions, with a duration of 5 or 10 seconds. New audio capabilities: the model supports automatic dubbing, or you can provide a custom audio file.
This is an asynchronous API: the request returns only a task_id. Use that task_id to query the Task Result API and retrieve the video generation results.
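The asynchronous flow above can be sketched as a polling loop. This is a minimal illustration, not the service's client: `fetch_task` is a hypothetical callable that you would implement against the Task Result API, and the status names `"succeeded"` and `"failed"` are assumptions, since the source does not list the actual values.

```python
import time
from typing import Any, Callable, Dict

# Assumed status names; check the Task Result API for the real values.
FINISHED_STATES = {"succeeded", "failed"}


def wait_for_task(
    task_id: str,
    fetch_task: Callable[[str], Dict[str, Any]],
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_task(task_id) until it reports a finished state."""
    for attempt in range(max_attempts):
        result = fetch_task(task_id)
        if result.get("status") in FINISHED_STATES:
            return result
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Passing `fetch_task` and `sleep` in as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise with a stub.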
Text prompt. Supports both English and Chinese, with a maximum length of 2000 characters; any excess is automatically truncated. Example value: A small cat running under the moonlight.
Negative prompt, describing content that should be avoided in the video. Supports both English and Chinese, with a maximum length of 500 characters; any excess is automatically truncated. Example value: Low resolution, errors, worst quality, low quality, incomplete, extra fingers, disproportionate, etc.
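Because excess text is truncated silently, it can help to check lengths before sending a request. The sketch below is a hypothetical client-side helper. It assumes that "characters" corresponds to Python's `len()` (Unicode code points), which the source does not confirm.

```python
from typing import List

MAX_PROMPT_CHARS = 2000
MAX_NEGATIVE_PROMPT_CHARS = 500


def truncation_warnings(prompt: str, negative_prompt: str = "") -> List[str]:
    """Report which inputs exceed the documented limits and would be truncated."""
    warnings = []
    # Assumption: the service counts characters the way len() does.
    if len(prompt) > MAX_PROMPT_CHARS:
        over = len(prompt) - MAX_PROMPT_CHARS
        warnings.append(f"prompt is {over} characters over the {MAX_PROMPT_CHARS} limit")
    if len(negative_prompt) > MAX_NEGATIVE_PROMPT_CHARS:
        over = len(negative_prompt) - MAX_NEGATIVE_PROMPT_CHARS
        warnings.append(
            f"negative prompt is {over} characters over the {MAX_NEGATIVE_PROMPT_CHARS} limit"
        )
    return warnings
```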
URL of the audio file that the model will use to generate the video. See audio settings for usage instructions.
Audio restrictions:
Format: wav, mp3.
Duration: 3-30s.
File size: No more than 15MB.
Overflow handling: If the audio is longer than the video duration (5 seconds or 10 seconds), only the first 5 or 10 seconds are used and the rest is discarded. If the audio is shorter than the video duration, the remainder of the video is silent. For example, if the audio is 3 seconds and the video duration is 5 seconds, the output video has sound for the first 3 seconds and is silent for the last 2 seconds.
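The audio restrictions and the overflow rule above can be expressed as two small helpers. This is an illustrative sketch: the function names are hypothetical, and treating 15MB as 15 * 1024 * 1024 bytes is an assumption, since the source does not say whether megabytes are binary or decimal.

```python
from typing import List, Tuple

ALLOWED_FORMATS = {"wav", "mp3"}
MIN_AUDIO_S = 3.0
MAX_AUDIO_S = 30.0
MAX_AUDIO_BYTES = 15 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def audio_problems(fmt: str, duration_s: float, size_bytes: int) -> List[str]:
    """List the documented audio restrictions that a file violates."""
    problems = []
    if fmt.lower() not in ALLOWED_FORMATS:
        problems.append(f"format {fmt!r} is not wav or mp3")
    if not MIN_AUDIO_S <= duration_s <= MAX_AUDIO_S:
        problems.append(f"duration {duration_s}s is outside 3-30s")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("file is larger than 15MB")
    return problems


def audio_coverage(audio_s: float, video_s: float) -> Tuple[float, float]:
    """Return (seconds with sound, seconds of silence) in the output video."""
    with_sound = min(audio_s, video_s)  # audio past the video length is discarded
    return with_sound, video_s - with_sound
```

For the example in the text, `audio_coverage(3, 5)` gives 3 seconds of sound followed by 2 seconds of silence.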
Supports 480P, 720P, and 1080P resolutions. Default value: 1920*1080 (1080P).
Specifies the video resolution in the format width*height. The supported resolutions for each tier are as follows:
480P tier: Available video resolutions and their corresponding aspect ratios are:
832*480: 16:9.
480*832: 9:16.
624*624: 1:1.
720P tier: Available video resolutions and their corresponding aspect ratios are:
1280*720: 16:9.
720*1280: 9:16.
960*960: 1:1.
1088*832: 4:3.
832*1088: 3:4.
1080P tier: Available video resolutions and their corresponding aspect ratios are:
1920*1080: 16:9.
1080*1920: 9:16.
1440*1440: 1:1.
1632*1248: 4:3.
1248*1632: 3:4.
Common mistake with the size parameter: size must be set to a specific resolution value (e.g., 1280*720), not an aspect ratio (e.g., 1:1) or a resolution tier name (e.g., 480P or 720P).
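One way to avoid passing an aspect ratio or a tier name is to validate size against the listed values before sending the request. The lookup table below is copied from the tiers above. The helper name `tier_for_size` is hypothetical.

```python
from typing import Dict

# Resolution -> aspect ratio, grouped by tier, as listed in this section.
SIZES_BY_TIER: Dict[str, Dict[str, str]] = {
    "480P": {"832*480": "16:9", "480*832": "9:16", "624*624": "1:1"},
    "720P": {
        "1280*720": "16:9",
        "720*1280": "9:16",
        "960*960": "1:1",
        "1088*832": "4:3",
        "832*1088": "3:4",
    },
    "1080P": {
        "1920*1080": "16:9",
        "1080*1920": "9:16",
        "1440*1440": "1:1",
        "1632*1248": "4:3",
        "1248*1632": "3:4",
    },
}
DEFAULT_SIZE = "1920*1080"


def tier_for_size(size: str) -> str:
    """Return the tier a width*height value belongs to, or raise ValueError."""
    for tier, sizes in SIZES_BY_TIER.items():
        if size in sizes:
            return tier
    raise ValueError(
        f"{size!r} is not a supported size; use a width*height value such as 1280*720"
    )
```

With this table, `tier_for_size("1280*720")` returns `"720P"`, while `"1:1"` and `"720P"` are rejected because they are not width*height values.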
Whether to enable intelligent prompt rewriting. When enabled, a large model rewrites the input prompt. This significantly improves generation quality for shorter prompts but increases processing time.
Random seed, used to control the randomness of the model’s generated content. The value range is [0, 2147483647]. If not provided, the algorithm automatically generates a random number as the seed. To keep the generated content relatively stable, use the same seed value across requests.
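If you want to be able to repeat a request later, one option is to choose the seed on the client side and record it, rather than letting the service pick one. The helper below is a hypothetical sketch of that idea. It only enforces the documented range and does not imply that identical seeds yield identical videos.

```python
import random
from typing import Optional

MAX_SEED = 2147483647  # upper bound from the documented range, equal to 2**31 - 1


def resolve_seed(seed: Optional[int] = None) -> int:
    """Return a seed in [0, 2147483647], generating one if none is given."""
    if seed is None:
        # randint includes both endpoints, matching the closed range.
        return random.randint(0, MAX_SEED)
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be in [0, {MAX_SEED}], got {seed}")
    return seed
```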