Speech Synthesis

Use MiniMax Speech HD models to generate speech from text. Voice management APIs can list system voices and create reusable private voice_id values through voice cloning or voice design.

When To Use It

Generate voiceover for short videos, ads, courses, narration, or digital humans.
Use a public voice from the voice list.
Clone or design a private voice, then pass its voice_id into a speech task.

Supported Models

Model	Type	Recommended use
`minimax-speech-2.8-hd`	Text to speech	Natural short-form speech, emotional narration, ads, digital-human voiceover.
`minimax-speech-02-hd`	Text to speech	Audiobooks, course narration, customer-service voice, news reading, long-form narration.

Endpoints

POST /v1/audio/tasks
GET  /v1/audio/tasks/{task_id}
GET  /v1/audio/voices
POST /v1/audio/voices/clone
POST /v1/audio/voices/design

Authentication

Authorization: Bearer sk-***
Content-Type: application/json

Speech Request Parameters

Parameter	Type	Required	Description
`model`	string	Yes	`minimax-speech-2.8-hd` or `minimax-speech-02-hd`.
`text`	string	Yes	Text to synthesize.
`voice_id`	string	Yes	Public or private voice ID.
`speed`	number	No	Speaking speed when supported by the model.
`volume`	number	No	Output volume when supported.
`pitch`	number	No	Pitch adjustment when supported.
`format`	string	No	Output format, such as `mp3` or `wav`, when supported.
`language`	string	No	Language hint for multilingual text.
`audio_setting`	object	No	Advanced audio settings passed through to the provider when supported.
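The parameter table above can be turned into a small client-side check before a task is submitted. The following is a minimal Python sketch, not part of the API itself: `build_speech_payload`, `SUPPORTED_MODELS`, and `OPTIONAL_FIELDS` are hypothetical names, and the validation rules only mirror what the table states (three required fields, six optional ones).

```python
from typing import Any

# Model names and optional fields as listed in the parameter table.
SUPPORTED_MODELS = {"minimax-speech-2.8-hd", "minimax-speech-02-hd"}
OPTIONAL_FIELDS = ("speed", "volume", "pitch", "format", "language", "audio_setting")


def build_speech_payload(model: str, text: str, voice_id: str, **options: Any) -> dict:
    """Build the JSON body for POST /v1/audio/tasks (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not text or not text.strip():
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_id is required")
    unknown = set(options) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"model": model, "text": text, "voice_id": voice_id}
    # Leave out options that were not set so the service can apply its defaults.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```

Catching a missing `text` or `voice_id` locally avoids spending a request on a task that would be rejected anyway.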

Speech Request Example

curl -X POST "{BASE_URL}/v1/audio/tasks" \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-speech-2.8-hd",
    "text": "Welcome back. Today we will introduce a faster way to build AI applications.",
    "voice_id": "voice_xxx",
    "format": "mp3",
    "speed": 1
  }'

Submit Response

{
  "task_id": "task_xxx",
  "status": "queued",
  "model": "minimax-speech-2.8-hd",
  "created_at": 1773980459
}
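The same submit call can be sketched in Python with only the standard library. This is an illustration under assumptions: `make_submit_request` and `submit_speech_task` are hypothetical helper names, the base URL is whatever `{BASE_URL}` stands for in your deployment, and the response is assumed to carry `task_id` at the top level as in the example above.

```python
import json
import urllib.request


def make_submit_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare POST /v1/audio/tasks with the headers from the Authentication section."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/tasks",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_speech_task(base_url: str, api_key: str, payload: dict,
                       opener=urllib.request.urlopen) -> str:
    """Submit the task and return its task_id.

    `opener` is injectable so the function can be exercised without network access.
    """
    request = make_submit_request(base_url, api_key, payload)
    with opener(request) as response:
        body = json.load(response)
    # Assumption: the submit response contains "task_id", as in the example response.
    return body["task_id"]
```

The returned `task_id` is the value to pass to `GET /v1/audio/tasks/{task_id}`.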

Query Task Status

curl "{BASE_URL}/v1/audio/tasks/task_xxx" \
  -H "Authorization: Bearer sk-***"

Example response for a completed task:

{
  "task_id": "task_xxx",
  "status": "succeeded",
  "progress": "100%",
  "output": {
    "audio_url": "https://example.com/speech.mp3"
  },
  "error": null
}
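Because a task starts as `queued` and later reports `succeeded`, a client typically polls the status endpoint until the audio URL is available. The sketch below shows one way to do that in Python. Several details are assumptions rather than documented behavior: the helper names, the polling interval and timeout, and the `"failed"` status value (only `queued` and `succeeded` appear in the examples here).

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: a terminal failure is reported with this status value.
FAILURE_STATUSES = {"failed"}


def fetch_task(base_url: str, api_key: str, task_id: str) -> dict:
    """GET /v1/audio/tasks/{task_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/audio/tasks/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_audio_url(fetch: Callable[[], dict], interval: float = 2.0,
                       timeout: float = 300.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Call `fetch` until the task succeeds, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch()
        status = task.get("status")
        if status == "succeeded":
            return task["output"]["audio_url"]
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"speech task failed: {task.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout} seconds")
        # Any other status is treated as still in progress.
        sleep(interval)
```

A typical call would be `wait_for_audio_url(lambda: fetch_task(base_url, api_key, task_id))`, where the three arguments come from your own configuration and the submit response.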

Voice Management

List available voices:

curl "{BASE_URL}/v1/audio/voices" \
  -H "Authorization: Bearer sk-***"

Clone a voice from reference audio:

curl -X POST "{BASE_URL}/v1/audio/voices/clone" \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "brand-narrator",
    "audio_url": "https://example.com/reference.wav"
  }'

Design a voice from text:

curl -X POST "{BASE_URL}/v1/audio/voices/design" \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "warm-host",
    "description": "Warm, clear, young adult narrator for product explainers"
  }'

Use the returned voice_id in later speech tasks.
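The clone and design calls differ only in their path and in the field that describes the voice, so they can share one request builder. This Python sketch is illustrative: `make_voice_request` and `read_voice_id` are hypothetical helpers, the body fields are copied from the curl examples above, and the assumption that the response exposes `voice_id` as a top-level key is not confirmed by this page.

```python
import json
import urllib.request

# Field each endpoint needs in addition to "name", taken from the curl examples.
REQUIRED_FIELD = {"clone": "audio_url", "design": "description"}


def make_voice_request(base_url: str, api_key: str, kind: str, name: str,
                       **fields: str) -> urllib.request.Request:
    """Prepare POST /v1/audio/voices/clone or /v1/audio/voices/design."""
    if kind not in REQUIRED_FIELD:
        raise ValueError("kind must be 'clone' or 'design'")
    required = REQUIRED_FIELD[kind]
    if not fields.get(required):
        raise ValueError(f"{kind} requires {required}")
    body = {"name": name, **fields}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/voices/{kind}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def read_voice_id(response_body: dict) -> str:
    """Extract the private voice ID from a decoded clone or design response."""
    # Assumption: the response carries "voice_id" at the top level.
    voice_id = response_body.get("voice_id")
    if not voice_id:
        raise KeyError("voice_id missing from response")
    return voice_id
```

The value returned by `read_voice_id` goes into the `voice_id` field of a later speech task.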

Billing Notes

Speech generation is billed by generated audio usage. Voice cloning and voice design may be billed separately because they create reusable private voices. For exact prices, check the current product pricing page.

Common Errors

Missing text or voice_id.
Passing a private voice_id that is not visible to the current account.
Submitting very long text without splitting it into smaller tasks.
Using an unsupported output format.
Insufficient balance or disabled API key.
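For the long-text case in the list above, one approach is to split the script at sentence boundaries and submit each chunk as its own task. The Python sketch below illustrates that idea. `split_text` is a hypothetical helper, and the `max_chars` default is a placeholder: this page does not state the actual length limit, so use the limit that applies to your account and model. The splitter only recognizes `.`, `!`, and `?` followed by whitespace.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 1000) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit, so close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can be sent as the `text` of a separate speech task, and the resulting audio files concatenated in order.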
