POST /v3/async/txt2speech

This text-to-speech API converts written text into natural-sounding speech in multiple languages. It is suited to applications such as e-learning platforms, accessibility tools, virtual assistants, and multimedia presentations.

This is an asynchronous API: the response contains only a task_id. Pass that task_id to the Task Result API to retrieve the speech generation results.
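The submit-then-poll flow described above can be sketched as a small helper. This is only a sketch: the Task Result API's URL and response fields are not described on this page, so `fetch_result` and `is_finished` are hypothetical callables that the caller supplies, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    task_id: str,
    fetch_result: Callable[[str], Dict[str, Any]],
    is_finished: Callable[[Dict[str, Any]], bool],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_result(task_id) repeatedly until is_finished(result) is true.

    fetch_result and is_finished are hypothetical hooks: this page does not
    document the Task Result API, so the caller decides how to query it and
    how to recognise a finished task.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_result(task_id)
        if is_finished(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s} s")
        # Wait before asking again so the API is not hammered.
        sleep(interval_s)
```

Injecting `sleep` keeps the helper easy to exercise without real delays; in normal use the default `time.sleep` applies.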

Request Headers

Content-Type
string
required

Enum: application/json

Authorization
string
required

Bearer authentication format, for example: Bearer {{API Key}}.

Request Body

extra
object

Optional extra parameters for the request.

request
object
required
voice_id
string
required

Voice ID
Enum: Emily, James, Olivia, Michael, Sarah, John

language
string
required

Language spoken in the generated audio.
Enum: en-US, zh-CN, ja-JP

texts
string[]
required

Source text to synthesize, UTF-8 encoded. Maximum 512 characters.

volume
number

Volume of the generated audio. Range: 1.0 to 2.0. Default: 1.0.

speed
number

Speed of the generated audio. Range: 0.8 to 3.0. Default: 1.0.
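The constraints listed for the request object can be checked on the client before sending. The following is a minimal sketch, not part of the API: `build_request_body` is a hypothetical helper, and it assumes the 512-character limit applies to each entry in texts, which this page does not state explicitly.

```python
from typing import Any, Dict, List

VOICE_IDS = {"Emily", "James", "Olivia", "Michael", "Sarah", "John"}
LANGUAGES = {"en-US", "zh-CN", "ja-JP"}
MAX_TEXT_CHARS = 512  # assumption: enforced per entry of `texts`


def build_request_body(
    voice_id: str,
    language: str,
    texts: List[str],
    volume: float = 1.0,
    speed: float = 1.0,
) -> Dict[str, Any]:
    """Validate the documented fields and return the JSON-ready request body."""
    if voice_id not in VOICE_IDS:
        raise ValueError(f"voice_id must be one of {sorted(VOICE_IDS)}")
    if language not in LANGUAGES:
        raise ValueError(f"language must be one of {sorted(LANGUAGES)}")
    if not texts or not all(isinstance(t, str) and t for t in texts):
        raise ValueError("texts must be a non-empty list of non-empty strings")
    for text in texts:
        if len(text) > MAX_TEXT_CHARS:
            raise ValueError(f"each text may have at most {MAX_TEXT_CHARS} characters")
    if not 1.0 <= volume <= 2.0:
        raise ValueError("volume must be between 1.0 and 2.0")
    if not 0.8 <= speed <= 3.0:
        raise ValueError("speed must be between 0.8 and 3.0")
    return {
        "request": {
            "voice_id": voice_id,
            "language": language,
            "texts": texts,
            "volume": volume,
            "speed": speed,
        }
    }
```

Validating locally gives an immediate error for an out-of-range value instead of a failed request; the server remains the authority on what it accepts.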

Response

task_id
string

Pass this task_id to the Task Result API to retrieve the generated outputs.
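Since task_id is the only documented response field, a client will usually pull it out of the JSON body and fail loudly if it is absent. A minimal sketch, where `extract_task_id` is a hypothetical helper name:

```python
import json


def extract_task_id(response_body: str) -> str:
    """Return the task_id string from a JSON response body, or raise ValueError."""
    payload = json.loads(response_body)
    task_id = payload.get("task_id") if isinstance(payload, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response does not contain a task_id string")
    return task_id
```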

Example

request

curl --location 'https://api.novita.ai/v3/async/txt2speech' \
--header 'Authorization: Bearer {{API Key}}' \
--header 'Content-Type: application/json' \
--data '{
  "request": {
    "voice_id": "Emily",
    "language": "en-US",
    "texts": [
      "To be or not to be, that is a question."
    ],
    "volume": 1.2,
    "speed": 1.0
  }
}'
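For readers working in Python, roughly the same request can be built with the standard library alone. This is a sketch under assumptions: `build_txt2speech_request` and `submit` are hypothetical helper names, the 30-second timeout is arbitrary, and error handling is left out.

```python
import json
import urllib.request

API_URL = "https://api.novita.ai/v3/async/txt2speech"


def build_txt2speech_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request with the two required headers and a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(req: urllib.request.Request, timeout_s: float = 30.0) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as the curl example.
example_body = {
    "request": {
        "voice_id": "Emily",
        "language": "en-US",
        "texts": ["To be or not to be, that is a question."],
        "volume": 1.2,
        "speed": 1.0,
    }
}
```

Calling `submit(build_txt2speech_request(api_key, example_body))` with a valid key would send the request; building and sending are kept separate so the request can be inspected without touching the network.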

response

{
  "task_id": "b49df8dc-4a72-474b-a863-xxx"
}