MiniMax Speech Voice Management
This document explains how to use the MiniMax voice capabilities: the voice list, Voice Cloning, and Voice Design.
If you only need to generate speech from text, use the speech generation endpoints:
POST /v1/audio/tasks
GET /v1/audio/tasks/{task_id}
If you want to create your own voice_id and then use it in speech generation, use the voice endpoints described on this page.
1. Supported Capabilities
| Capability | Endpoint | Billed | Note |
|---|---|---|---|
| List Voices | GET /v1/audio/voices | No | Returns system voices and private voices |
| Voice Cloning | POST /v1/audio/voices/clone | Yes | Generates a voice_id from reference audio |
| Voice Design | POST /v1/audio/voices/design | Yes | Generates a voice_id from a text description |
| Generate speech with a voice | POST /v1/audio/tasks | Yes | Pass the voice_id to a TTS model to generate audio |
Recommended TTS models:
| Model name | Note |
|---|---|
| minimax-speech-2.8-hd | Newer speech generation model; suited to video, digital human, and ad voiceover |
| minimax-speech-02-hd | Stable speech generation model; suited to audiobooks and text reading |
A voice_id created by Voice Cloning or Voice Design can be used with these TTS models.
2. Endpoint URLs
The examples on this page use this base URL:
https:
All requests call the aijisu API.
| Operation | Method | Path |
|---|---|---|
| List Voices | GET | /v1/audio/voices |
| Voice Cloning | POST | /v1/audio/voices/clone |
| Voice Design | POST | /v1/audio/voices/design |
| Submit speech generation task | POST | /v1/audio/tasks |
| Query speech generation task | GET | /v1/audio/tasks/{task_id} |
3. Authentication
All endpoints use Bearer token authentication:
Authorization: Bearer YOUR_API_KEY
Example:
curl https://api.xxx.xx/v1/audio/voices \
-H "Authorization: Bearer YOUR_API_KEY"
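The same header can be attached from Python. The sketch below is a minimal illustration using only the standard library. The helper name build_request and the AIJISU_API_KEY environment variable are assumptions made for this example, not part of the API. It builds the request object without sending it.

```python
import os
import urllib.request

# Placeholder host used by the examples on this page.
BASE_URL = "https://api.xxx.xx"


def build_request(path, api_key=None, method="GET"):
    """Return an unsent request carrying the Bearer token header."""
    # AIJISU_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("AIJISU_API_KEY")
    if not key:
        raise ValueError("no API key provided")
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {key}"},
    )


request = build_request("/v1/audio/voices", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. Reading the key from the environment keeps it out of source files.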
4. Voice ID Notes
voice_id identifies the voice to use when calling TTS.
Voice types:
| Type | Note |
|---|---|
| System voices | Voices provided by the platform; users can query and use them |
| Private voices | Voices created through Voice Cloning or Voice Design |
After Voice Cloning or Voice Design succeeds, the endpoint returns a voice_id. Pass that voice_id to /v1/audio/tasks to generate speech.
5. List Voices
5.1 Details
GET /v1/audio/voices
Optional query parameters:
| Parameter | Type | Required | Note |
|---|---|---|---|
| model | string | No | Returns only voices compatible with this model. Optional values: minimax-speech-2.8-hd or minimax-speech-02-hd |
5.2 Query All Voices
curl "https://api.xxx.xx/v1/audio/voices" \
-H "Authorization: Bearer YOUR_API_KEY"
5.3 Query Speech 2.8 HD Voices
curl "https://api.xxx.xx/v1/audio/voices?model=minimax-speech-2.8-hd" \
-H "Authorization: Bearer YOUR_API_KEY"
5.4 Query Speech 02 HD Voices
curl "https://api.xxx.xx/v1/audio/voices?model=minimax-speech-02-hd" \
-H "Authorization: Bearer YOUR_API_KEY"
5.5 Response Examples
{
"object": "audio.voice.list",
"model": "minimax-speech-2.8-hd",
"data": [
{
"voice_id": "Chinese (Mandarin)_Kind-hearted_Elder",
"display_name": "Kind-hearted Elder",
"language": "Chinese (Mandarin)",
"description": "MiniMax system voice",
"source_type": "system",
"visibility": "public",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": null,
"created_at": "2026-06-23T18:05:59Z"
},
{
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"display_name": "local-design-taskid-smoke",
"language": "Chinese (Mandarin)",
"description": "Local smoke test for task id in sync response.",
"source_type": "voice_design",
"visibility": "private",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": "https://api.xxx.xx/media/preview.mp3",
"created_at": "2026-06-24T08:42:23Z"
}
]
}
5.6 Response Fields
| Field | Note |
|---|---|
| object | Always audio.voice.list |
| model | Model the voice list applies to |
| data | Array of voices |
| data[].voice_id | Voice ID; the value to pass when calling TTS |
| data[].display_name | Voice name |
| data[].language | Voice language |
| data[].description | Voice description |
| data[].source_type | Voice source: system, voice_clone, or voice_design |
| data[].visibility | Visibility: public or private |
| data[].status | Status, for example active |
| data[].compatible_models | TTS models that can use this voice |
| data[].preview_audio_url | Preview audio URL; may be null |
| data[].created_at | Creation time |
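As an illustration of how these fields can be combined, the sketch below picks the voice_id values that are active and compatible with a given model, optionally restricted to one source_type. The function name select_voice_ids is hypothetical, and the sample data is a trimmed copy of the response example in 5.5.

```python
def select_voice_ids(voice_list, model, source_type=None):
    """Return voice_id values that are active and compatible with `model`."""
    selected = []
    for voice in voice_list.get("data", []):
        if voice.get("status") != "active":
            continue
        if model not in voice.get("compatible_models", []):
            continue
        if source_type is not None and voice.get("source_type") != source_type:
            continue
        selected.append(voice["voice_id"])
    return selected


# Trimmed copy of the response example in 5.5.
sample = {
    "object": "audio.voice.list",
    "data": [
        {
            "voice_id": "Chinese (Mandarin)_Kind-hearted_Elder",
            "source_type": "system",
            "status": "active",
            "compatible_models": ["minimax-speech-2.8-hd", "minimax-speech-02-hd"],
        },
        {
            "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
            "source_type": "voice_design",
            "status": "active",
            "compatible_models": ["minimax-speech-2.8-hd", "minimax-speech-02-hd"],
        },
    ],
}

print(select_voice_ids(sample, "minimax-speech-2.8-hd", source_type="voice_design"))
```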
6. Voice Cloning
Voice Cloning creates a new private voice_id from a reference audio clip.
Suitable scenarios:
- Generating a consistent voice for a digital human
- Audiobook and video narration
- Reusing the voice in later TTS calls
6.1 Details
POST /v1/audio/voices/clone
The request body is JSON.
| Parameter | Type | Required | Note |
|---|---|---|---|
| audio_url | string | Yes | Reference audio URL; must be reachable by the endpoint service |
| text | string | Recommended | Text of the reference audio or preview text; helps cloning quality and preview generation |
| preview_text | string | No | Compatibility field for text. If both are passed, keeping it identical to text is recommended |
| display_name | string | No | Voice name shown in the voice list |
| name | string | No | Compatibility field for display_name |
| language | string | No | Voice language, for example Chinese (Mandarin) or English |
| description | string | No | Voice description |
| noise_reduction | boolean | No | Whether to apply noise reduction |
| need_volume_normalization | boolean | No | Whether to apply volume normalization |
| accuracy | number/string | No | Accuracy parameter; the platform supports passing it through |
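A request body can be assembled from these parameters before it is sent. The following is a minimal sketch, not the platform's own client: the helper build_clone_payload is hypothetical, it checks only that audio_url is present, and it omits any optional field left unset.

```python
def build_clone_payload(audio_url, text=None, display_name=None, language=None,
                        description=None, noise_reduction=None,
                        need_volume_normalization=None):
    """Build the JSON body for POST /v1/audio/voices/clone."""
    if not audio_url:
        # audio_url is the only parameter the table marks as required.
        raise ValueError("audio_url is required")
    optional = {
        "text": text,
        "display_name": display_name,
        "language": language,
        "description": description,
        "noise_reduction": noise_reduction,
        "need_volume_normalization": need_volume_normalization,
    }
    payload = {"audio_url": audio_url}
    # Leave out fields the caller did not set instead of sending nulls.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_clone_payload(
    "https://example.com/audio/reference-speaker.mp3",
    display_name="course-teacher-voice",
    language="Chinese (Mandarin)",
    noise_reduction=True,
)
print(payload)
```

Dropping unset fields keeps the body close to the curl examples, which send only the fields they need.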
6.2 Reference Audio Recommendations
| Item | Recommendation |
|---|---|
| audio | , |
| audio | recommended 10, |
| quality | , music, |
| stable, do not or | |
| text | If the spoken content of the audio is known, passing the matching text is recommended |
| use |
6.3 Request Examples
curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"audio_url": "https://example.com/audio/reference-speaker.mp3",
"text": ". Method. ",
"display_name": "course-teacher-voice",
"language": "Chinese (Mandarin)"}'
6.4 Noise Reduction and Volume Normalization
curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"audio_url": "https://example.com/audio/noisy-reference.wav",
"text": " Yes Voice Cloning reference audio, voice stable. ",
"display_name": "cleaned-brand-speaker",
"language": "Chinese (Mandarin)",
"description": " voice",
"noise_reduction": true,
"need_volume_normalization": true}'
6.5 English Voice Cloning
curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"audio_url": "https://example.com/audio/english-host.mp3",
"text": "Welcome back to the show. Today we are going to explore a simple but powerful idea.",
"display_name": "english-podcast-host",
"language": "English",
"description": "Warm English podcast host voice"}'
6.6 Response Examples
{
"object": "audio.voice",
"model": "minimax-voice-clone",
"task_id": "task_abc123",
"voice_id": "VoiceClone123456",
"preview_audio_url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
"voice": {
"voice_id": "VoiceClone123456",
"display_name": "course-teacher-voice",
"language": "Chinese (Mandarin)",
"description": " voice",
"source_type": "voice_clone",
"visibility": "private",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
"created_at": "2026-06-24T08:42:23Z"
},
"billing_contract": {
"billing_version": "media-v1",
"public_model": "minimax-voice-clone",
"operation": "audio.voice_clone",
"settlement_policy": "fixed_at_estimate",
"billing_stage": "final",
"facts": {
"voice_clones": 1,
"preview_characters": 36
}
},
"outputs": [
{
"url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
"type": "audio"
}
]
}
7. Voice Design
Voice Design generates a new private voice_id from a text description; no reference audio is required.
Suitable scenarios:
- Video voiceover voices
- Digital humans
- Quickly generating a voice
- Scenarios with no reference audio, only a description
7.1 Details
POST /v1/audio/voices/design
The request body is JSON.
| Parameter | Type | Required | Note |
|---|---|---|---|
| prompt | string | Yes | Voice design description: voice characteristics, language, scenario |
| preview_text | string | Yes | Text used to generate the preview audio |
| text | string | No | Compatibility field for preview_text |
| display_name | string | No | Voice name shown in the voice list |
| name | string | No | Compatibility field for display_name |
| language | string | No | Voice language, for example Chinese (Mandarin) or English |
| description | string | No | Voice description |
7.2 Prompt Recommendations
A prompt should describe:
| Aspect | Example |
|---|---|
| Language | Chinese Mandarin, English, Cantonese |
| Gender | male, female |
| Age | young adult, middle-aged, elder |
| Timbre | warm, clear, soft, energetic, calm |
| Scenario | product demo, customer service, audiobook, game character |
| Style | friendly, confident, gentle, dramatic |
| Pace | slow, medium pace, lively |
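One way to keep prompts consistent is to assemble them from separate traits. The sketch below is only an illustration: the function build_design_prompt, its parameter names, and the sentence template are assumptions made here, and the source does not say whether the model prefers full sentences or keyword lists.

```python
def build_design_prompt(language, gender, age, timbre, scenario, style, pace):
    """Join voice traits into one English sentence for the prompt field."""
    timbre_text = ", ".join(timbre)
    style_text = ", ".join(style)
    return (
        f"A {timbre_text} {language} {gender} voice, {age}, "
        f"for {scenario}; {style_text} tone, {pace} pace."
    )


prompt = build_design_prompt(
    language="Chinese Mandarin",
    gender="female",
    age="young adult",
    timbre=["warm", "clear"],
    scenario="product demo",
    style=["friendly", "confident"],
    pace="medium",
)
print(prompt)
```

The resulting string can be passed as the prompt parameter of POST /v1/audio/voices/design.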
7.3 Chinese Narration Voice
curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A warm, calm Chinese Mandarin female narrator voice for short product demos, clear diction, gentle confidence, studio quality.",
"preview_text": ", YesaVoice Design. newspeechCapability. ",
"display_name": "warm-product-narrator",
"language": "Chinese (Mandarin)",
"description": " andshort-video narration "}'
7.4 Digital Human Voice
curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A confident Chinese Mandarin female digital human presenter voice, natural conversational tone, bright but not exaggerated, suitable for business explanation videos.",
"preview_text": ", Yes.. ",
"display_name": "digital-human-presenter",
"language": "Chinese (Mandarin)",
"description": "digital human voice"}'
7.5 Customer-Service Broadcast Voice
curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A polite Chinese Mandarin customer service voice, patient, clear, stable, friendly, suitable for service notifications and call center messages.",
"preview_text": ", processingcompleted.. ",
"display_name": "customer-service-clear",
"language": "Chinese (Mandarin)",
"description": " and voice"}'
7.6 Game Character Voice
curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A young Chinese Mandarin fantasy game character voice, playful, lively, slightly mysterious, expressive but clear.",
"preview_text": ".. ",
"display_name": "fantasy-guide-character",
"language": "Chinese (Mandarin)",
"description": " voice"}'
7.7 English Commercial Voice
curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A modern English commercial voice, energetic, premium, confident, suitable for product launch ads and social media videos.",
"preview_text": "Meet the new way to create, edit, and publish your ideas in minutes.",
"display_name": "english-commercial-premium",
"language": "English",
"description": "English commercial voice for product ads"}'
7.8 Audiobook Voice
curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A mature Chinese Mandarin audiobook narrator voice, calm, steady, immersive, with clear pronunciation and comfortable pacing.",
"preview_text": ". ",
"display_name": "audiobook-calm-narrator",
"language": "Chinese (Mandarin)",
"description": "audiobook voice"}'
7.9 Response Examples
{
"object": "audio.voice",
"model": "minimax-voice-design",
"task_id": "task_437fb17536aa4ff7830ffb7a39f43a99",
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
"voice": {
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"display_name": "warm-product-narrator",
"language": "Chinese (Mandarin)",
"description": " andshort-video narration ",
"source_type": "voice_design",
"visibility": "private",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
"created_at": "2026-06-24T08:42:23Z"
},
"billing_contract": {
"billing_version": "media-v1",
"public_model": "minimax-voice-design",
"operation": "audio.voice_design",
"settlement_policy": "fixed_at_estimate",
"billing_stage": "final",
"facts": {
"voice_designs": 1,
"preview_characters": 12
}
},
"outputs": [
{
"url": "https://api.xxx.xx/media/design-preview.mp3",
"type": "audio"
}
]
}
8. Speech Generation with a voice_id
After Voice Cloning or Voice Design succeeds, pass the returned voice_id to /v1/audio/tasks to generate speech.
8.1 Generate Chinese Speech with a Voice Design Voice
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "., fast. ",
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"speed": 1.0,
"response_format": "url"}'
8.2 Generate a Course Explanation with a Voice Cloning Voice
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ".. ",
"voice_id": "VoiceClone123456",
"speed": 0.95,
"response_format": "url"}'
8.3 Query a Speech Generation Task
curl "https://api.xxx.xx/v1/audio/tasks/task_abc123" \
-H "Authorization: Bearer YOUR_API_KEY"
8.4 Completed Speech Generation Task Response Example
{
"object": "audio.generation.job",
"task_id": "task_abc123",
"model": "minimax-speech-2.8-hd",
"status": "completed",
"audio_url": "https://api.xxx.xx/media/output.mp3",
"result": {
"audio_url": "https://api.xxx.xx/media/output.mp3",
"outputs": [
"https://api.xxx.xx/media/output.mp3"
],
"audios": [
{
"url": "https://api.xxx.xx/media/output.mp3"
}
]
}
}
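Because the audio URL appears in several places in this response, a client may want a single lookup plus a polling loop around the task query. The sketch below is hedged accordingly: only the "completed" status is shown on this page, so the "processing" and "failed" status names are assumptions, and fetch_task stands in for whatever function performs GET /v1/audio/tasks/{task_id}.

```python
import time


def extract_audio_url(task):
    """Return the first audio URL found in a task response, or None."""
    if task.get("audio_url"):
        return task["audio_url"]
    result = task.get("result") or {}
    if result.get("audio_url"):
        return result["audio_url"]
    if result.get("outputs"):
        return result["outputs"][0]
    if result.get("audios"):
        return result["audios"][0].get("url")
    return None


def wait_for_audio(fetch_task, task_id, interval=2.0, max_attempts=30):
    """Poll fetch_task(task_id) until the task reports status "completed"."""
    for _ in range(max_attempts):
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "completed":
            return extract_audio_url(task)
        if status == "failed":  # assumed status name, not confirmed by this page
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not complete in time")


# Offline demonstration with canned responses instead of real HTTP calls.
canned = iter([
    {"task_id": "task_abc123", "status": "processing"},  # assumed intermediate status
    {"task_id": "task_abc123", "status": "completed",
     "audio_url": "https://api.xxx.xx/media/output.mp3"},
])
print(wait_for_audio(lambda task_id: next(canned), "task_abc123", interval=0))
```

Injecting fetch_task keeps the loop testable without network access; in real use it would wrap the task query request.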
9. Billing Notes
Actual billing follows the aijisu console and your account. The endpoint response provides the billing basis.
| Capability | Billing items | Billing formula |
|---|---|---|
| List Voices | Not billed | |
| Voice Cloning | Per clone + preview characters | voice_clones * 1.5 + preview_characters * 0.0003 |
| Voice Design | Per design + preview characters | voice_designs * 3 + preview_characters * 0.00003 |
| Speech generation | Billed under the TTS model's billing rules | |
Note:
- preview_characters is the number of Unicode characters in the preview text.
- Chinese, English, spaces, punctuation, line breaks, and emoji all count.
- It is not the UTF-8 byte count and not the token count.
- Voice Cloning and Voice Design are billed on submission.
- If a request fails, the platform handles it according to its failure rules.
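The formulas in the table can be checked with a few lines of arithmetic. The sketch below is an estimate only, under two assumptions: "Unicode characters" is taken to mean code points (what Python's len returns for a str), and the result is in whatever billing unit the platform uses, which this page does not name. Decimal is used to avoid floating-point rounding in the small per-character rates.

```python
from decimal import Decimal

# Rates copied from the billing table above.
CLONE_BASE = Decimal("1.5")
CLONE_PER_CHAR = Decimal("0.0003")
DESIGN_BASE = Decimal("3")
DESIGN_PER_CHAR = Decimal("0.00003")


def count_preview_characters(text):
    """Count Unicode code points, not UTF-8 bytes (assumed counting rule)."""
    return len(text)


def estimate_clone(voice_clones, preview_characters):
    return voice_clones * CLONE_BASE + preview_characters * CLONE_PER_CHAR


def estimate_design(voice_designs, preview_characters):
    return voice_designs * DESIGN_BASE + preview_characters * DESIGN_PER_CHAR


# Facts taken from the billing_contract blocks in the 6.6 and 7.9 examples.
print(estimate_clone(1, 36))
print(estimate_design(1, 12))

# Character count versus UTF-8 byte count for mixed Chinese and English text.
text = "你好, world"
print(count_preview_characters(text), len(text.encode("utf-8")))
```

The last line shows why the distinction matters: the same string has fewer characters than UTF-8 bytes, so counting bytes would overestimate the charge.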
10. Node.js Example
10.1 Voice Design
// Requires built-in fetch (Node.js 18+) and an ES module for top-level await.
const response = await fetch("https://api.xxx.xx/v1/audio/voices/design", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "A warm Chinese Mandarin female narrator voice, clear and calm.",
    preview_text: ", YesVoice Design. ",
    display_name: "node-design-voice",
    language: "Chinese (Mandarin)"
  })
});
const data = await response.json();
console.log(data.voice_id);
console.log(data.preview_audio_url);
10.2 Create a Speech Task with a Generated voice_id
// Use the voice_id returned by Voice Design or Voice Cloning.
const voiceId = "ttv-voice-2026062416421526-E4jmMP8B";
const response = await fetch("https://api.xxx.xx/v1/audio/tasks", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "minimax-speech-2.8-hd",
    text: " Yesause voicegeneration speech. ",
    voice_id: voiceId,
    response_format: "url"
  })
});
const task = await response.json();
console.log(task.task_id);
11. Python Example
11.1 Voice Cloning
import requests

api_key = "YOUR_API_KEY"

response = requests.post(
    "https://api.xxx.xx/v1/audio/voices/clone",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "audio_url": "https://example.com/audio/reference-speaker.mp3",
        "text": ". ",
        "display_name": "python-clone-voice",
        "language": "Chinese (Mandarin)",
        "noise_reduction": True,
        "need_volume_normalization": True,
    },
    timeout=180,
)
response.raise_for_status()  # raise on HTTP errors before parsing
data = response.json()
print(data["voice_id"])
print(data.get("preview_audio_url"))
11.2 List Voices
import requests

api_key = "YOUR_API_KEY"

response = requests.get(
    "https://api.xxx.xx/v1/audio/voices",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"model": "minimax-speech-2.8-hd"},
    timeout=30,
)
response.raise_for_status()  # raise on HTTP errors before parsing
voices = response.json()["data"]
for voice in voices:
    print(voice["voice_id"], voice.get("display_name"))
12. Common Errors
12.1 Missing Authentication
{
"error": {
"message": "API key required.",
"type": "invalid_request_error",
"code": "api_key_required"
}
}
Fix: check whether the request headers include Authorization: Bearer YOUR_API_KEY.
12.2 Voice Cloning Missing audio_url
{
"error": {
"message": "`audio_url` is required.",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}
Fix: pass in a reference audio URL as audio_url.
12.3 Voice Design Missing prompt
{
"error": {
"message": "`prompt` is required.",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}
Fix: pass a prompt that describes the voice, language, and scenario.
12.4 Voice Design Missing preview_text
{
"error": {
"message": "`preview_text` is required.",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}
Fix: pass in the text used to generate the preview audio.
12.5 voice_id Unavailable
{
"error": {
"message": "`voice_id` is not visible for the current client or is not compatible with this model",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}
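Possible causes:
- The voice_id is wrong.
- A private voice that is not visible to the current client is used.
- The voice is not compatible with the TTS model.
- The voice is unavailable.
Fix:
- Call GET /v1/audio/voices?model=... to query the available voices.
- Take a voice_id from the returned results.
- Pass that voice_id to /v1/audio/tasks.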
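The check described above can be done before submitting a task. The sketch below works on an already fetched voice list response and returns a short reason when the voice looks unusable. The function name diagnose_voice_id and the reason strings are hypothetical, and the list is assumed to come from GET /v1/audio/voices with the same API key that will submit the task.

```python
def diagnose_voice_id(voice_list, voice_id, model):
    """Return a reason string, or None if the voice looks usable."""
    for voice in voice_list.get("data", []):
        if voice.get("voice_id") != voice_id:
            continue
        if voice.get("status") != "active":
            return "voice is not active"
        if model not in voice.get("compatible_models", []):
            return "voice is not compatible with this model"
        return None
    return "voice_id is not in the list visible to this API key"


# Small hand-written list shaped like the response example in 5.5.
voices = {
    "data": [
        {
            "voice_id": "VoiceClone123456",
            "status": "active",
            "compatible_models": ["minimax-speech-02-hd"],
        }
    ]
}

print(diagnose_voice_id(voices, "VoiceClone123456", "minimax-speech-02-hd"))
print(diagnose_voice_id(voices, "VoiceClone123456", "minimax-speech-2.8-hd"))
print(diagnose_voice_id(voices, "unknown-voice", "minimax-speech-02-hd"))
```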
13. Usage Recommendations
13.1 When to Use Voice Cloning
When you have reference audio and need a voice that matches it, use Voice Cloning.
Suitable scenarios:
- audio.
- audio.
- TTS reference.
13.2 When to Use Voice Design
When you have no reference audio and only a description of the voice, use Voice Design.
Suitable scenarios:
- Digital humans.
- Generating voices for videos.
- generation.
- fast or voice.
13.3 Voice Naming Recommendations
Use stable, descriptive names for display_name.
Example:
brand-female-presenter
course-teacher-male
customer-service-clear
game-guide-young
english-commercial-premium
13.4 Saving the voice_id
After a voice is created successfully, save:
- voice_id
- display_name
- source_type
- preview_audio_url
- compatible_models
- created_at
The most important is voice_id. TTS calls require passing the voice_id.
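A creation response carries more than these fields, so it can help to reduce it to a small record before storing it. The sketch below is one possible shape, assuming the response layout shown in the 6.6 and 7.9 examples; the helper name voice_record is hypothetical and where the record is stored is left to the caller.

```python
import json

# Fields this section recommends saving.
SAVED_FIELDS = (
    "voice_id",
    "display_name",
    "source_type",
    "preview_audio_url",
    "compatible_models",
    "created_at",
)


def voice_record(create_response):
    """Reduce a clone or design response to the fields worth saving."""
    voice = create_response.get("voice") or {}
    record = {field: voice.get(field) for field in SAVED_FIELDS}
    # Fall back to the top-level voice_id if the nested object lacks one.
    record["voice_id"] = record["voice_id"] or create_response.get("voice_id")
    if not record["voice_id"]:
        raise ValueError("response does not contain a voice_id")
    return record


# Trimmed copy of the Voice Design response example in 7.9.
response_example = {
    "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
    "task_id": "task_437fb17536aa4ff7830ffb7a39f43a99",
    "voice": {
        "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
        "display_name": "warm-product-narrator",
        "source_type": "voice_design",
        "compatible_models": ["minimax-speech-2.8-hd", "minimax-speech-02-hd"],
        "preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
        "created_at": "2026-06-24T08:42:23Z",
    },
}

record = voice_record(response_example)
print(json.dumps(record, indent=2))
```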
13.5 Calling Recommendations
- Voice Cloning and Voice Design can take longer than TTS requests; a longer request timeout is recommended.
- Do not expose your API key.
- Calling the aijisu API from your own backend service is recommended.
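- Reuse a saved voice_id instead of creating the same voice again.
- After creating a voice, listen to the preview audio and confirm the result.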