Novita AI Text-To-Speech API | Natural-Sounding Speech Synthesis

POST /v3/async/txt2speech

curl --request POST \
  --url https://api.novita.ai/v3/async/txt2speech \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '{
  "extra": {
    "response_audio_type": "<string>",
    "webhook": {
      "url": "<string>",
      "test_mode": {
        "enabled": true,
        "return_task_status": "<string>"
      }
    },
    "enterprise_plan": {
      "enabled": true
    }
  },
  "request": {
    "voice_id": "<string>",
    "language": "<string>",
    "texts": [
      "<string>"
    ],
    "volume": 123,
    "speed": 123
  }
}'

{
  "task_id": "<string>"
}

This Text-To-Speech API converts written text into natural-sounding speech across multiple languages. Utilizing advanced voice synthesis technology, it delivers clear and lifelike vocal output, suitable for a wide range of applications, including e-learning platforms, accessibility tools, virtual assistants, and multimedia presentations.

This API is asynchronous: the response contains only a task_id. Pass that task_id to the Task Result API to retrieve the generated speech.
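Because the call returns before synthesis finishes, a client typically polls for the result. The following is a minimal Python sketch of such a loop. The Task Result API's URL and response fields are not described on this page, so `fetch_result` and `is_finished` are hypothetical callables supplied by the caller, and `wait_for_task` with its default interval and attempt count is illustrative only.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    task_id: str,
    fetch_result: Callable[[str], Dict[str, Any]],   # hypothetical: calls the Task Result API
    is_finished: Callable[[Dict[str, Any]], bool],   # hypothetical: inspects the returned status
    interval_s: float = 2.0,                         # assumed polling interval
    max_attempts: int = 30,                          # assumed retry budget
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_result(task_id) until is_finished(result) is true."""
    for attempt in range(max_attempts):
        result = fetch_result(task_id)
        if is_finished(result):
            return result
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.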

Request Headers

Content-Type

string

required

Enum: application/json

Authorization

string

required

Bearer authentication format, for example: Bearer {{API Key}}.
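As a small illustration, the two required headers could be assembled in Python as follows. The helper name `build_headers` is hypothetical; only the header names and the `Bearer` format come from this page.

```python
from typing import Dict


def build_headers(api_key: str) -> Dict[str, str]:
    """Return the two required request headers for a given API key."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```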

Request Body

extra

object

Optional extra parameters for the request.

request

object

required

voice_id

string

required

Voice ID
Enum: Emily, James, Olivia, Michael, Sarah, John

language

string

required

The language spoken in the generated audio.
Enum: en-US, zh-CN, ja-JP

texts

string[]

required

Source text to synthesize, UTF-8 encoded, up to a maximum of 512 characters.

volume

number

Controls the volume of the generated audio. Accepts a value between 1.0 and 2.0; the default is 1.0.

speed

number

Controls the speed of the generated audio. Accepts a value between 0.8 and 3.0; the default is 1.0.
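The constraints listed above can be checked on the client before sending a request. The sketch below is one possible validator, not part of the API. It assumes that the numeric ranges are inclusive and that the 512-character limit applies to each entry of `texts` (the page does not say whether the limit is per entry or in total); `validate_request` and the constant names are hypothetical.

```python
from typing import Any, Dict, List

VOICE_IDS = {"Emily", "James", "Olivia", "Michael", "Sarah", "John"}
LANGUAGES = {"en-US", "zh-CN", "ja-JP"}
MAX_TEXT_CHARS = 512  # assumption: applied to each entry of `texts`


def _in_range(value: Any, low: float, high: float) -> bool:
    """True if value is a real number (not a bool) within [low, high]."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return low <= value <= high


def validate_request(request: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: List[str] = []
    if request.get("voice_id") not in VOICE_IDS:
        problems.append("voice_id must be one of: " + ", ".join(sorted(VOICE_IDS)))
    if request.get("language") not in LANGUAGES:
        problems.append("language must be one of: " + ", ".join(sorted(LANGUAGES)))
    texts = request.get("texts")
    if not isinstance(texts, list) or not texts or not all(isinstance(t, str) for t in texts):
        problems.append("texts must be a non-empty list of strings")
    elif any(len(t) > MAX_TEXT_CHARS for t in texts):
        problems.append(f"each text must be at most {MAX_TEXT_CHARS} characters")
    # volume and speed are optional; both default to 1.0 per the field descriptions.
    if not _in_range(request.get("volume", 1.0), 1.0, 2.0):
        problems.append("volume must be between 1.0 and 2.0")
    if not _in_range(request.get("speed", 1.0), 0.8, 3.0):
        problems.append("speed must be between 0.8 and 3.0")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field at once.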

Response

task_id

string

Pass this task_id to the Task Result API to retrieve the generated audio.

Example

request

curl --location 'https://api.novita.ai/v3/async/txt2speech' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
  "request": {
    "voice_id": "Emily",
    "language": "en-US",
    "texts": [
      "To be, or not to be, that is the question."
    ],
    "volume": 1.2,
    "speed": 1.0
  }
}'

response

{
  "task_id": "b49df8dc-4a72-474b-a863-xxx"
}
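For readers working in Python, the curl example above could be expressed with the standard library roughly as follows. This is a sketch under assumptions: the function names `build_tts_request` and `submit` are hypothetical, error handling is omitted, and `submit` assumes a successful response has the JSON shape shown above.

```python
import json
import urllib.request
from typing import List

API_URL = "https://api.novita.ai/v3/async/txt2speech"


def build_tts_request(
    api_key: str,
    voice_id: str,
    language: str,
    texts: List[str],
    volume: float = 1.0,
    speed: float = 1.0,
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the txt2speech endpoint."""
    payload = {
        "request": {
            "voice_id": voice_id,
            "language": language,
            "texts": texts,
            "volume": volume,
            "speed": speed,
        }
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(req: urllib.request.Request) -> str:
    """Send the request and return the task_id from the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["task_id"]
```

A call such as `submit(build_tts_request(api_key, "Emily", "en-US", ["Hello"], volume=1.2))` would then return the task_id to pass to the Task Result API.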
