Fish Audio API for creating a voice model (voice cloning).
Authentication uses a Bearer token in the Authorization header, for example: Bearer {{API Key}}.
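The Bearer format above can be sketched as a small helper that builds the request headers. This is only an illustration: the function name `bearer_headers` is hypothetical, and the key itself must come from your own account.

```python
def bearer_headers(api_key: str) -> dict:
    """Build HTTP headers carrying the API key as a Bearer credential.

    `bearer_headers` is a hypothetical helper name, not part of any SDK.
    """
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    # Standard Bearer scheme: "Authorization: Bearer <token>"
    return {"Authorization": f"Bearer {api_key.strip()}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.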
Request Body
Model type; tts is for text-to-speech. Allowed value: "tts"
Model training mode. For a TTS model, fast means the model is available immediately after creation. Allowed value: "fast"
Voice files to upload; they are used to tune the model.
visibility
enum<string>
default: "public"
Model visibility: public models are shown on the discovery page, unlist models can be accessed by anyone with the link, and private models are visible only to the creator. Available options: public, unlist, private
Model cover image; required if the model is public.
Texts corresponding to the voice files. If unspecified, ASR (automatic speech recognition) is performed on the voice files.
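The request-body constraints above can be checked on the client side before anything is uploaded. The following is a minimal sketch, not the official client: only `visibility` and `train_mode` appear as field names on this page, so the keys `type`, `voices`, `cover_image`, and `texts` are assumptions, as is the helper name `build_model_fields`.

```python
from typing import Optional, Sequence

# Options listed for the visibility field.
ALLOWED_VISIBILITY = ("public", "unlist", "private")


def build_model_fields(
    voice_paths: Sequence[str],
    visibility: str = "public",
    cover_image_path: Optional[str] = None,
    texts: Optional[Sequence[str]] = None,
) -> dict:
    """Validate and collect the create-model fields (hypothetical helper)."""
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"visibility must be one of {ALLOWED_VISIBILITY}")
    # The page states that a cover image is required for a public model.
    if visibility == "public" and cover_image_path is None:
        raise ValueError("a cover image is required for a public model")

    fields = {
        "type": "tts",         # only allowed value (key name assumed)
        "train_mode": "fast",  # only allowed value on creation
        "visibility": visibility,
        "voices": list(voice_paths),  # key name assumed
    }
    if cover_image_path is not None:
        fields["cover_image"] = cover_image_path  # key name assumed
    if texts is not None:
        # When texts are omitted, the service runs ASR on the voice files.
        fields["texts"] = list(texts)  # key name assumed
    return fields
```

For example, `build_model_fields(["sample.wav"], visibility="private")` passes validation without a cover image, while the default public visibility raises an error until a cover image path is supplied.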
Response
Unique identifier for the created model.
Model type. Available options: svc, tts
URL of the model cover image.
Current state of the model. Available options: created, training, trained, failed
created_at
string<date-time>
required
Timestamp when the model was created.
updated_at
string<date-time>
required
Timestamp when the model was last updated.
Model visibility setting. Available options: public, unlist, private
Number of likes the model has received.
Number of marks/bookmarks the model has received.
Number of times the model has been shared.
Number of tasks associated with the model.
author
AuthorEntity · object
required
Information about the model author.
Author’s unique identifier.
URL of the author’s avatar image.
train_mode
enum<string>
default: "full"
Training mode used for the model. Available options: fast, full
Sample data associated with the model.
Text content of the sample.
Task identifier for the sample.
URL of the sample audio file.
Languages supported by the model.
Whether the visibility setting is locked.
Whether the current user has unliked the model.
Whether the current user has liked the model.
Whether the current user has marked/bookmarked the model.
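As an illustration of how a client might read the response described above, the sketch below checks the model state and parses the two date-time fields. It assumes the response has already been decoded from JSON into a dictionary and that the state is stored under a key named `state`; that key, the helper names, and the idea that `trained` means the model is usable are assumptions not confirmed by this page.

```python
from datetime import datetime

# States listed for the model on this page.
KNOWN_STATES = {"created", "training", "trained", "failed"}


def parse_timestamp(value: str) -> datetime:
    """Parse a date-time string such as 2024-01-02T03:04:05Z."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_model(model: dict) -> dict:
    """Summarize a decoded response (the "state" key is an assumption)."""
    state = model["state"]
    if state not in KNOWN_STATES:
        raise ValueError(f"unexpected model state: {state!r}")
    return {
        "ready": state == "trained",  # assumed to mean usable
        "finished": state in ("trained", "failed"),
        "created_at": parse_timestamp(model["created_at"]),
        "updated_at": parse_timestamp(model["updated_at"]),
    }
```

A caller that polls for completion could stop once `finished` is true and then inspect `ready` to tell success from failure.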