Wan 2.7 Reference-to-Video

curl --request POST \
  --url https://api.novita.ai/v3/async/wan2.7-r2v \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "seed": 123,
  "size": "<string>",
  "audio": true,
  "media": [
    {
      "url": "<string>",
      "type": "<string>",
      "reference_voice": "<string>"
    }
  ],
  "prompt": "<string>",
  "duration": 123,
  "shot_type": "<string>",
  "watermark": true,
  "negative_prompt": "<string>"
}
'

{
  "task_id": "<string>"
}

POST /v3/async/wan2.7-r2v

Wan 2.7 Reference-to-Video generates video from reference characters and accepts multimodal input (text, image, video). It can produce single-character performances or multi-character interactions, and supports intelligent multi-shot generation. Output is available at 720P and 1080P with a duration of 2 to 10 seconds, billed per second. Audio is included in the output by default.

This is an asynchronous API: the response contains only a task_id. Pass that task_id to the Task Result API to retrieve the video generation results.

Request Headers

Content-Type

string

required

Supports: application/json

Authorization

string

required

Bearer authentication format, for example: Bearer {{API Key}}.
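
As an illustration of the two required headers, the following minimal sketch builds the POST request in Python using only the standard library. The endpoint URL comes from the curl example above; the helper name build_request, the API key string, the media URL, and the prompt are placeholders invented for this example. The request is constructed but not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this page.
API_URL = "https://api.novita.ai/v3/async/wan2.7-r2v"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a generation task."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical payload: the media URL and prompt are placeholders.
payload = {
    "media": [{"url": "https://example.com/actor.png", "type": "reference_image"}],
    "prompt": "character1 waves at the camera",
}
request = build_request("YOUR_API_KEY", payload)
# To submit, pass `request` to urllib.request.urlopen and read the JSON reply.
```

Keeping construction separate from sending makes it easy to inspect the headers and body before any billable call is made.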

Request Body

seed

integer

Random seed for improving reproducibility. Value range: [0, 2147483647]

size

string

default:"1920*1080"

Output video resolution (width*height); affects pricing. 720P: 1280*720 (16:9), 720*1280 (9:16), 960*960 (1:1), 1088*832 (4:3), 832*1088 (3:4). 1080P: 1920*1080 (16:9), 1080*1920 (9:16), 1440*1440 (1:1), 1632*1248 (4:3), 1248*1632 (3:4). Optional values: 1280*720, 720*1280, 960*960, 1088*832, 832*1088, 1920*1080, 1080*1920, 1440*1440, 1632*1248, 1248*1632
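
Because the size string must match one of the listed values exactly, a small lookup table can help avoid typos. The sketch below is one possible way to pick a size from a resolution tier and an aspect ratio; the names SIZES and size_for are hypothetical, and the values are copied from the list of optional values for this parameter.

```python
# Size strings copied from the "size" parameter's optional values.
SIZES = {
    "720P": {
        "16:9": "1280*720",
        "9:16": "720*1280",
        "1:1": "960*960",
        "4:3": "1088*832",
        "3:4": "832*1088",
    },
    "1080P": {
        "16:9": "1920*1080",
        "9:16": "1080*1920",
        "1:1": "1440*1440",
        "4:3": "1632*1248",
        "3:4": "1248*1632",
    },
}


def size_for(tier: str, aspect_ratio: str) -> str:
    """Return the size string for a tier ("720P"/"1080P") and aspect ratio."""
    try:
        return SIZES[tier][aspect_ratio]
    except KeyError:
        raise ValueError(f"unsupported combination: {tier} {aspect_ratio}") from None


default_size = size_for("1080P", "16:9")  # the documented default, "1920*1080"
```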

audio

boolean

default:true

Whether to generate audio in the video; affects pricing. Defaults to true (with audio).

media

array

required

Reference media array used to extract character appearance, motion, and voice. Items map to character1, character2, etc. in order. Images: 0-5, videos: 0-3, total <= 5. Image formats: JPEG, JPG, PNG, BMP, WEBP; resolution [240, 8000] px; max 10 MB. Video formats: MP4, MOV; duration 1-30 s; max 100 MB. Audio formats: MP3, WAV, FLAC; duration 3-30 s. Array length: 1 - 5

Properties of each media item:

url

string

required

Media file URL.

type

string

required

Media type. reference_image: reference image for character appearance; reference_video: reference video for character motion and appearance; first_frame: first-frame image that controls the starting frame of the video. Optional values: reference_image, reference_video, first_frame

reference_voice

string

URL of the character's reference audio, used for voice cloning. Formats: MP3, WAV, FLAC; duration 3 to 30 s.
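
The count limits on the media array can be checked on the client before submitting. The following is a minimal sketch under the limits stated for the media parameter (1 to 5 items in total, at most 3 reference videos, a required url and type on each item). The function name validate_media is hypothetical, and the check does not inspect file formats, sizes, or durations.

```python
ALLOWED_TYPES = {"reference_image", "reference_video", "first_frame"}


def validate_media(media: list) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if not 1 <= len(media) <= 5:
        problems.append("media must contain 1 to 5 items")
    videos = 0
    for index, item in enumerate(media, start=1):
        if not item.get("url"):
            problems.append(f"item {index}: missing url")
        kind = item.get("type")
        if kind not in ALLOWED_TYPES:
            problems.append(f"item {index}: unknown type {kind!r}")
        if kind == "reference_video":
            videos += 1
    if videos > 3:
        problems.append("at most 3 reference videos are allowed")
    return problems


# Placeholder URLs for illustration only.
media = [
    {"url": "https://example.com/a.png", "type": "reference_image"},
    {"url": "https://example.com/b.mp4", "type": "reference_video",
     "reference_voice": "https://example.com/b.wav"},
]
issues = validate_media(media)  # no problems for this array
```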

prompt

string

required

Text prompt describing the desired video content. Use character1, character2, etc. to reference characters from the media array in order. Each reference (video or image) contains a single character. Supports Chinese and English, max 1500 characters. Length limit: 0 - 1500
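
Since characterN in the prompt refers to the Nth item of the media array, a prompt can accidentally name a character that has no matching item. The sketch below shows one way to catch that, along with the length limit. The function name check_prompt is hypothetical, and counting length with Python's len (Unicode code points) is an assumption about how the service counts characters.

```python
import re


def check_prompt(prompt: str, media_count: int) -> list:
    """Return problems found in a prompt for a media array of the given length."""
    problems = []
    if len(prompt) > 1500:
        problems.append("prompt exceeds 1500 characters")
    # Collect every characterN reference and compare it with the media count.
    referenced = {int(n) for n in re.findall(r"character(\d+)", prompt)}
    for number in sorted(referenced):
        if not 1 <= number <= media_count:
            problems.append(f"character{number} has no matching media item")
    return problems


ok = check_prompt("character1 hands a book to character2", media_count=2)
bad = check_prompt("character3 waves", media_count=2)
```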

duration

integer

default:5

Video duration in seconds, billed per second. Must be an integer. Value range: [2, 10]

shot_type

string

default:"single"

Shot type. single: one continuous shot (default); multi: multiple shots with transitions. Takes priority over the prompt. Optional values: single, multi

watermark

boolean

default:false

Add watermark to the output video (bottom-right corner).

negative_prompt

string

Negative prompt describing undesired content in the video. Supports Chinese and English, max 500 characters. Length limit: 0 - 500
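
The scalar options in the request body have simple ranges that can be checked before sending. The following sketch applies the documented ranges and defaults for seed, duration, shot_type, and negative_prompt. The function name validate_options is hypothetical, and the service may apply further checks not covered here.

```python
def validate_options(payload: dict) -> list:
    """Check the scalar request-body options against their documented ranges."""
    problems = []
    seed = payload.get("seed")
    if seed is not None and not 0 <= seed <= 2147483647:
        problems.append("seed must be in [0, 2147483647]")
    duration = payload.get("duration", 5)  # documented default is 5
    if not (isinstance(duration, int) and 2 <= duration <= 10):
        problems.append("duration must be an integer in [2, 10]")
    if payload.get("shot_type", "single") not in ("single", "multi"):
        problems.append("shot_type must be 'single' or 'multi'")
    if len(payload.get("negative_prompt", "")) > 500:
        problems.append("negative_prompt exceeds 500 characters")
    return problems


good = validate_options({"seed": 42, "duration": 8, "shot_type": "multi"})
bad = validate_options({"duration": 12, "shot_type": "pan"})
```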

Response

task_id

string

Use the task_id to request the Task Result API to retrieve the generated outputs.
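
As a small illustration, the task_id can be read from the JSON response body as follows. The function name extract_task_id and the sample value are hypothetical; the response shape (a single task_id string) follows the example response on this page.

```python
import json


def extract_task_id(response_body: str) -> str:
    """Parse the JSON response body and return its task_id."""
    data = json.loads(response_body)
    task_id = data.get("task_id") if isinstance(data, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id")
    return task_id


# Sample body with a made-up identifier.
task_id = extract_task_id('{"task_id": "abc-123"}')
```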
