POST https://api.novita.ai/v3/glm-tts
GLM Text to Speech
curl --request POST \
  --url https://api.novita.ai/v3/glm-tts \
  --header 'Authorization: Bearer <API Key>' \
  --header 'Content-Type: application/json' \
  --output speech.wav \
  --data '
{
  "input": "<string>",
  "speed": 1.0,
  "voice": "tongtong",
  "volume": 1.0,
  "response_format": "wav",
  "watermark_enabled": true
}
'
Convert text to natural speech using GLM-TTS, supporting multiple voices, emotion control, and tone adjustment.
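
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. This is a minimal illustration, not an official client: the helper names `build_request` and `synthesize` and the 60-second timeout are assumptions made for this example; only the URL, the HTTP method, and the two headers come from this page.

```python
import json
import urllib.request

API_URL = "https://api.novita.ai/v3/glm-tts"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the GLM-TTS endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize(api_key: str, payload: dict, timeout: float = 60.0) -> bytes:
    """Send the request and return the binary audio body.

    The timeout value is an arbitrary choice for this sketch.
    urllib raises urllib.error.HTTPError on non-2xx responses.
    """
    with urllib.request.urlopen(build_request(api_key, payload), timeout=timeout) as resp:
        return resp.read()
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without touching the network.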

Request Headers

Content-Type
string
required
Supports: application/json
Authorization
string
required
Bearer authentication format, for example: Bearer {{API Key}}.

Request Body

input
string
required
The text to convert to speech. Length limit: 0 - 1024
speed
number
default:1
Speech speed. Default is 1.0. Value range: [0.5, 2]
voice
string
default:"tongtong"
required
The voice to use for audio generation, supporting both system voices and cloned voices. System voices include: tongtong (Tongtong, default voice), chuichui (Chuichui), xiaochen (Xiaochen), jam (Dongdong Zoo jam voice), kazi (Dongdong Zoo kazi voice), douji (Dongdong Zoo douji voice), luodo (Dongdong Zoo luodo voice)
volume
number
default:1
Volume. Default is 1.0. Value range: (0, 10]
response_format
string
default:"pcm"
Audio output format. Defaults to pcm. Optional values: wav, pcm
watermark_enabled
boolean
Controls whether a watermark is added to the generated AI audio. true: enables the explicit watermark and the implicit digital watermark for AI-generated content by default, complying with policy requirements. false: disables all watermarks; this only takes effect for users who have completed the watermark removal action.
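
The parameter limits above can be checked on the client before a request is sent. The following is a hedged sketch rather than official validation logic: `build_payload` is a hypothetical helper, the length limit is assumed to be counted in characters (the page does not state the unit), and the volume lower bound is treated as exclusive. The voice is only checked for being non-empty, because cloned voice names are also accepted.

```python
ALLOWED_FORMATS = {"wav", "pcm"}


def build_payload(text, voice="tongtong", speed=1.0, volume=1.0,
                  response_format="pcm", watermark_enabled=None):
    """Validate arguments against the documented ranges and return the JSON body."""
    # Assumption: the 1024 limit is measured in characters.
    if len(text) > 1024:
        raise ValueError("input must be at most 1024 characters")
    if not voice:
        raise ValueError("voice is required")
    if not 0.5 <= speed <= 2:
        raise ValueError("speed must be within [0.5, 2]")
    # Assumption: 0 itself is not a valid volume.
    if not 0 < volume <= 10:
        raise ValueError("volume must be within (0, 10]")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError("response_format must be 'wav' or 'pcm'")

    payload = {
        "input": text,
        "voice": voice,
        "speed": speed,
        "volume": volume,
        "response_format": response_format,
    }
    # Only send watermark_enabled when the caller sets it explicitly.
    if watermark_enabled is not None:
        payload["watermark_enabled"] = watermark_enabled
    return payload
```

For example, `build_payload("Hello", response_format="wav")` returns a dictionary ready to be serialized as the request body, while `build_payload("Hello", speed=3)` raises `ValueError`.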

Response

Request processed successfully. The recommended sample rate is 24000 Hz. Format: binary
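
When `response_format` is pcm, the body is raw audio without a header, so most players need it wrapped in a WAV container first. The sketch below uses Python's standard `wave` module with the recommended 24000 Hz sample rate. Note that this page does not specify the sample layout: 16-bit mono is an assumption here and should be verified against real output. `pcm_to_wav` is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM bytes in a WAV container and return the WAV bytes.

    Assumptions (not stated in the API reference): mono audio with
    16-bit samples (sample_width=2 bytes).
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buf.getvalue()
```

The returned bytes can be written to a file opened in binary mode, for example `open("speech.wav", "wb").write(pcm_to_wav(audio))`, where `audio` is the response body.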