Speech Synthesis

Use MiniMax Speech HD models to generate speech from text. Voice management APIs can list system voices and create reusable private voice_id values through voice cloning or voice design.

When To Use It

  • Generate voiceover for short videos, ads, courses, narration, or digital humans.
  • Use a public voice from the voice list.
  • Clone or design a private voice, then pass its voice_id into a speech task.

Supported Models

Model | Type | Recommended use
minimax-speech-2.8-hd | Text to speech | Natural short-form speech, emotional narration, ads, digital-human voiceover.
minimax-speech-02-hd | Text to speech | Audiobooks, course narration, customer-service voice, news reading, long-form narration.

Endpoints

POST /v1/audio/tasks
GET /v1/audio/tasks/{task_id}
GET /v1/audio/voices
POST /v1/audio/voices/clone
POST /v1/audio/voices/design

Authentication

Authorization: Bearer sk-***
Content-Type: application/json

Speech Request Parameters

Parameter | Type | Required | Description
model | string | Yes | minimax-speech-2.8-hd or minimax-speech-02-hd.
text | string | Yes | Text to synthesize.
voice_id | string | Yes | Public or private voice ID.
speed | number | No | Speaking speed when supported by the model.
volume | number | No | Output volume when supported.
pitch | number | No | Pitch adjustment when supported.
format | string | No | Output format, such as mp3 or wav, when supported.
language | string | No | Language hint for multilingual text.
audio_setting | object | No | Advanced audio settings passed through to the provider when supported.
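
The parameter table above can be turned into a small client-side check before a task is submitted. The following is a minimal sketch, not an official SDK: the function name build_speech_payload and the choice to reject unknown keys are assumptions made for illustration. Only the field names and the two model IDs come from the table.

```python
from typing import Any, Dict

# Model IDs and field names are taken from the tables in this page.
SUPPORTED_MODELS = {"minimax-speech-2.8-hd", "minimax-speech-02-hd"}
OPTIONAL_FIELDS = {"speed", "volume", "pitch", "format", "language", "audio_setting"}


def build_speech_payload(model: str, text: str, voice_id: str, **optional: Any) -> Dict[str, Any]:
    """Build the JSON body for POST /v1/audio/tasks (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not text or not text.strip():
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_id is required")

    # Rejecting unknown keys is a local design choice, not documented API behavior.
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload: Dict[str, Any] = {"model": model, "text": text, "voice_id": voice_id}
    # Omit optional fields that were not set, so the service can apply its own defaults.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

Catching a missing text or voice_id locally avoids spending a request on a task that would be rejected anyway.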

Speech Request Example

curl -X POST "{BASE_URL}/v1/audio/tasks" \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-speech-2.8-hd",
    "text": "Welcome back. Today we will introduce a faster way to build AI applications.",
    "voice_id": "voice_xxx",
    "format": "mp3",
    "speed": 1
  }'

Submit Response

{
  "task_id": "task_xxx",
  "status": "queued",
  "model": "minimax-speech-2.8-hd",
  "created_at": 1773980459
}
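
The same submission can be made from Python with only the standard library. This is a sketch under stated assumptions: make_submit_request and submit_speech_task are hypothetical helper names, base_url stands in for the {BASE_URL} placeholder, and the response is assumed to carry task_id at the top level, as in the example above.

```python
import json
import urllib.request
from typing import Any, Dict


def make_submit_request(base_url: str, api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) the POST /v1/audio/tasks request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_speech_task(base_url: str, api_key: str, payload: Dict[str, Any], timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the submit response."""
    request = make_submit_request(base_url, api_key, payload)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["task_id"]
```

Keeping request construction separate from sending makes the headers and body easy to inspect without a network call.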

Query Task Status

curl "{BASE_URL}/v1/audio/tasks/task_xxx" \
  -H "Authorization: Bearer sk-***"

Response:

{
  "task_id": "task_xxx",
  "status": "succeeded",
  "progress": "100%",
  "output": {
    "audio_url": "https://example.com/speech.mp3"
  },
  "error": null
}
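
Because a task starts as queued and only later reports succeeded, a client normally polls the status endpoint until the task finishes. The sketch below shows one way to do that. The helper names, the polling interval, the timeout, and the "failed" status value are all assumptions: this page only shows "queued" and "succeeded".

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# "succeeded" appears in the example response; "failed" is an assumed failure status.
TERMINAL_STATUSES = {"succeeded", "failed"}


def fetch_task(base_url: str, api_key: str, task_id: str, timeout: float = 30.0) -> Dict[str, Any]:
    """GET /v1/audio/tasks/{task_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/audio/tasks/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_task(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the task reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {task.get('status')!r} after {timeout} seconds")
        sleep(interval)


def audio_url_from(task: Dict[str, Any]) -> str:
    """Return output.audio_url, or raise if the task did not succeed."""
    if task.get("status") != "succeeded":
        raise RuntimeError(f"task did not succeed: {task.get('error')}")
    return task["output"]["audio_url"]
```

A typical call would be wait_for_task(lambda: fetch_task(base_url, api_key, task_id)), followed by audio_url_from on the result.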

Voice Management

List available voices:

curl "{BASE_URL}/v1/audio/voices" \
  -H "Authorization: Bearer sk-***"

Clone a voice from reference audio:

curl -X POST "{BASE_URL}/v1/audio/voices/clone" \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "brand-narrator",
    "audio_url": "https://example.com/reference.wav"
  }'

Design a voice from text:

curl -X POST "{BASE_URL}/v1/audio/voices/design" \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "warm-host",
    "description": "Warm, clear, young adult narrator for product explainers"
  }'

Use the returned voice_id in later speech tasks.
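
The clone and design calls share the same request shape, so one helper can cover both. This is a sketch only: make_voice_request and create_voice are hypothetical names, and the response schema is not shown on this page, so reading voice_id from the top level of the response is an assumption.

```python
import json
import urllib.request
from typing import Any, Dict

VOICE_ACTIONS = ("clone", "design")


def make_voice_request(base_url: str, api_key: str, action: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Prepare a POST to /v1/audio/voices/clone or /v1/audio/voices/design."""
    if action not in VOICE_ACTIONS:
        raise ValueError(f"action must be one of {VOICE_ACTIONS}")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/voices/{action}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_voice(base_url: str, api_key: str, action: str, body: Dict[str, Any], timeout: float = 60.0) -> str:
    """Send the request and return the new private voice ID."""
    request = make_voice_request(base_url, api_key, action, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.loads(response.read().decode("utf-8"))
    # Assumption: the response exposes the ID as a top-level "voice_id" field.
    return result["voice_id"]
```

The returned value can then be passed as voice_id in the body of a POST /v1/audio/tasks request.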

Billing Notes

Speech generation is billed based on generated audio usage. Voice cloning and voice design may be billed separately, because they create reusable private voices. For exact prices, refer to the current product pricing.

Common Errors

  • Missing text or voice_id.
  • Passing a private voice_id that is not visible to the current account.
  • Submitting very long text without splitting it into smaller tasks.
  • Using an unsupported output format.
  • Insufficient balance or disabled API key.
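
To avoid the long-text error in the list above, text can be split into smaller pieces and each piece submitted as its own task. The sketch below splits on sentence boundaries. The split_text name and the 1000-character default are assumptions: this page does not state the actual length limit, so the value should be adjusted to the real one.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 1000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at fixed positions.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk yields a separate audio file, so the caller is responsible for keeping the chunks in order and joining the resulting audio.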