# MOSS TTS v1.5 - Documentation

Source: /docs/api-reference/model-apis-speech

`POST /v3/moss-tts/v1/audio/speech`

cURL

```
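curl --request POST \
  --url https://api.novita.ai/v3/moss-tts/v1/audio/speech \
  --header 'Authorization: Bearer <API Key>' \
  --header 'Content-Type: application/json' \
  --data '{"model": "MOSS-TTS", "input": "Hello, world.", "stream": false, "response_format": "wav"}' \
  --output moss-tts-local.wav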
```

MOSS TTS v1.5 text-to-speech API. It accepts either a JSON body or a multipart request (for reference audio) and returns either a complete WAV file or a streaming PCM audio binary. Because the response is binary audio, add `--output` when testing with curl to save it to a file (e.g. `--output moss-tts-local.wav`); otherwise the binary content prints directly to the terminal.
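As a rough illustration of the JSON variant described above, the following Python sketch builds the request with the standard library and writes the returned bytes to disk. The helper names `build_speech_request` and `synthesize_to_file` are hypothetical, chosen only for this example, and error handling is omitted.

```python
import json
import urllib.request

API_URL = "https://api.novita.ai/v3/moss-tts/v1/audio/speech"


def build_speech_request(api_key: str, text: str) -> urllib.request.Request:
    """Build a non-streaming JSON request (hypothetical helper, not part of the API)."""
    payload = {
        "model": "MOSS-TTS",
        "input": text,
        "stream": False,
        "response_format": "wav",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key: str, text: str, path: str) -> None:
    """Send the request and save the binary response body to `path`."""
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Calling `synthesize_to_file(api_key, "Hello from MOSS TTS.", "moss-tts-local.wav")` with a valid key would then play the same role as the `--output` flag in the curl example.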

## Request Headers

### Content-Type

`string`, required

Supported values: `application/json`, `multipart/form-data`.

### Authorization

`string`, required

Bearer authentication format, for example: `Bearer {{API Key}}`.

## Request Body

- `application/json`
- `multipart/form-data`

### input

`string`, required

The text to synthesize. Submitting a complete sentence or paragraph in a single request is recommended.

### model

`string`, required, default: `"MOSS-TTS"`

Fixed value `MOSS-TTS`, which selects MOSS TTS v1.5. Allowed values: `MOSS-TTS`.

### stream

`boolean`, optional, default: `false`

`false` returns a complete WAV file; `true` returns a PCM stream suitable for playback while generation is still in progress.

### response_format

`string`, optional, default: `"wav"`

Use `wav` for non-streaming requests; must be `pcm` when `stream` is `true`. Allowed values: `wav`, `pcm`.
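The constraint between `stream` and `response_format` can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_payload` is a hypothetical helper, and its choice to default `response_format` to `pcm` when streaming is an assumption made for convenience, not documented server behavior.

```python
from typing import Optional

ALLOWED_FORMATS = ("wav", "pcm")


def build_payload(text: str, stream: bool = False,
                  response_format: Optional[str] = None) -> dict:
    """Assemble the JSON body fields and enforce the stream/format rule."""
    if not text:
        raise ValueError("input is required")
    if response_format is None:
        # Assumed convenience default: pcm for streaming, wav otherwise.
        response_format = "pcm" if stream else "wav"
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {ALLOWED_FORMATS}")
    if stream and response_format != "pcm":
        # The documentation requires pcm whenever stream is true.
        raise ValueError("response_format must be 'pcm' when stream is true")
    return {
        "model": "MOSS-TTS",
        "input": text,
        "stream": stream,
        "response_format": response_format,
    }
```

The documentation only recommends `wav` for non-streaming requests, so the sketch does not reject `pcm` in that case.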

### ref_audio

`string`

Reference audio file field for `multipart/form-data` requests; upload a WAV or MP3 file.

### request_json

`string`, required

A string containing the JSON body fields, e.g. `model`, `input`, `stream`, `response_format`.
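To make the multipart layout concrete, here is a minimal sketch that encodes the `request_json` and `ref_audio` fields by hand with the standard library. The function name `encode_multipart`, the `audio/wav` part content type, and the ordering of the two parts are assumptions for illustration; the API reference only names the two fields.

```python
import json
import uuid
from typing import Optional, Tuple


def encode_multipart(request_fields: dict, audio_bytes: bytes, audio_filename: str,
                     audio_mime: str = "audio/wav",
                     boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = [
        # Part 1: the JSON body fields, serialized into a single string field.
        delimiter,
        b'Content-Disposition: form-data; name="request_json"',
        b"",
        json.dumps(request_fields).encode("utf-8"),
        # Part 2: the reference audio file (content type is an assumption).
        delimiter,
        ('Content-Disposition: form-data; name="ref_audio"; '
         f'filename="{audio_filename}"').encode("utf-8"),
        f"Content-Type: {audio_mime}".encode("ascii"),
        b"",
        audio_bytes,
        # Closing delimiter, followed by a final CRLF.
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(parts)
    return body, f"multipart/form-data; boundary={boundary}"
```

The returned content type string, including the boundary, would be sent as the `Content-Type` header alongside `Authorization`, with `body` as the request data.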

## Response

Returns binary audio on success. A non-streaming response is a complete WAV file; a streaming response is a sequence of raw PCM chunks. The streaming PCM format is described by the response headers (defaults: 48000 Hz, mono, 16-bit little-endian).
Format: `binary`
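Raw PCM has no container, so a streamed response needs a header added before most players will open it as a file. The sketch below wraps PCM chunks in a WAV container using Python's `wave` module. The helper name `pcm_chunks_to_wav` is hypothetical, and because the exact response header names are not listed here, the format is passed in as parameters that default to the documented values.

```python
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks: Iterable[bytes], sample_rate: int = 48000,
                      channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM chunks in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)      # 1 = mono (documented default)
        wav_file.setsampwidth(sample_width)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)   # 48000 Hz (documented default)
        for chunk in chunks:
            wav_file.writeframes(chunk)
    # wave does not close a file object it did not open, so the buffer is still readable.
    return buffer.getvalue()
```

If the response headers report a different sample rate or channel count, those values should be passed in instead of the defaults.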

Last modified on June 26, 2026
