MOSS TTS v1.5
POST
MOSS TTS v1.5 text-to-speech API. It accepts either a JSON body or a multipart request (for reference audio) and returns binary audio: a complete WAV file, or a raw PCM stream when streaming is enabled. When testing with curl, add --output to save the response to a file (e.g. --output moss-tts-local.wav); otherwise the binary content is printed directly to the terminal.
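As a rough Python counterpart to the curl note above, the sketch below builds a JSON POST request and writes the binary response to a file. It rests on several assumptions that this page does not confirm: the endpoint URL is a placeholder, the body field names `text`, `model`, and `response_format` are hypothetical (only `stream`, the `MOSS-TTS` value, and the `wav`/`pcm` values appear in this reference), and the Bearer token is assumed to travel in the standard `Authorization` header.

```python
import json
import urllib.request

# Hypothetical endpoint: this page does not state the request URL.
API_URL = "https://api.example.com/v1/audio/speech"


def build_tts_request(text, api_key, stream=False, url=API_URL):
    """Build (but do not send) a JSON POST request for MOSS TTS v1.5."""
    payload = {
        "text": text,         # assumed field name for the text to synthesize
        "model": "MOSS-TTS",  # fixed value per this page; field name assumed
        "stream": stream,     # parameter name taken from this page
        # Per this page: wav for non-streaming, pcm when stream=true.
        "response_format": "pcm" if stream else "wav",  # assumed field name
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # assumed header for the Bearer token
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def synthesize_to_file(text, api_key, out_path="moss-tts-local.wav"):
    """Send a non-streaming request and save the binary audio to disk."""
    request = build_tts_request(text, api_key, stream=False)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

Calling `synthesize_to_file("Hello, world.", api_key)` would play the same role as passing --output to curl: the audio bytes go to a file instead of the terminal.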
Request Headers
Supports: application/json, multipart/form-data
Bearer authentication format, for example: Bearer {{API Key}}.
Request Body
- application/json
- multipart/form-data
Required. The text to synthesize. Submitting a complete sentence or paragraph in a single request is recommended.
Required. Fixed value MOSS-TTS, which selects MOSS TTS v1.5. Allowed values: MOSS-TTS
Optional. false returns a complete WAV file; true returns a PCM stream suitable for playback while audio is still being generated.
Optional. Use wav for non-streaming requests; must be pcm when stream=true. Allowed values: wav, pcm
Response
Returns binary audio on success. A non-streaming response is a complete WAV file; a streaming response is a sequence of raw PCM chunks. The streaming PCM format is described by the response headers (defaults: 48000 Hz, mono, 16-bit little-endian).
Format: binary
Last modified on June 26, 2026
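Because a streaming response is headerless PCM, most players cannot open the saved bytes directly. A minimal sketch, assuming the stream uses the documented defaults (48000 Hz, mono, 16-bit little-endian) and that the chunks have already been concatenated into one bytes object, wraps them in a WAV container with Python's standard `wave` module. The function name `pcm_to_wav_bytes` is illustrative only; if the response headers report a different format, pass those values instead of the defaults.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=48000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the file bytes.

    Defaults mirror the documented streaming format: 48000 Hz, mono,
    16-bit (2 bytes per sample). Assumes a little-endian host, since the
    wave module treats incoming frames as native byte order.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# Usage: 480 silent frames, i.e. 10 ms of audio at 48000 Hz.
silence = b"\x00\x00" * 480
wav_bytes = pcm_to_wav_bytes(silence)
```

The resulting bytes can be written to a .wav file and opened in an ordinary audio player.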