MiniMax Speech Voice Management

This document explains how to use the MiniMax voice capabilities: the voice list, Voice Cloning, and Voice Design.

If you only need to generate speech from text, use the speech generation endpoints:

  • POST /v1/audio/tasks
  • GET /v1/audio/tasks/{task_id}

If you want to create your own voice_id and then use that voice_id in speech generation, use the voice endpoints on this page.

1. Supported Capabilities

| Capability | Endpoint | Billed | Note |
| --- | --- | --- | --- |
| List Voices | GET /v1/audio/voices | No | Returns system voices and private voices |
| Voice Cloning | POST /v1/audio/voices/clone | Yes | Generates a voice_id from reference audio |
| Voice Design | POST /v1/audio/voices/design | Yes | Generates a voice_id from a description |
| Generate speech with a voice | POST /v1/audio/tasks | Yes | Pass the voice_id to a TTS model to generate audio |

Recommended TTS models:

| Model name | Note |
| --- | --- |
| minimax-speech-2.8-hd | New speech generation model; suited to video, digital human, and ad voiceover |
| minimax-speech-02-hd | Stable speech generation model; suited to audiobooks and text reading |

A voice_id created by Voice Cloning or Voice Design can be used with these TTS models.

2. Endpoint URLs

The examples in this document call the aijisu API at the following base URL:

https://api.xxx.xx

| Operation | Method | Path |
| --- | --- | --- |
| List Voices | GET | /v1/audio/voices |
| Voice Cloning | POST | /v1/audio/voices/clone |
| Voice Design | POST | /v1/audio/voices/design |
| Submit a speech generation task | POST | /v1/audio/tasks |
| Query a speech generation task | GET | /v1/audio/tasks/{task_id} |

3. Authentication

All endpoints use Bearer token authentication:

Authorization: Bearer YOUR_API_KEY

Example:

curl https://api.xxx.xx/v1/audio/voices \
-H "Authorization: Bearer YOUR_API_KEY"
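
The same header can be built in Python. The following is a minimal sketch that assumes the key is stored in an environment variable; the variable name AIJISU_API_KEY and the helper name auth_headers are illustrative choices, not part of the API.

```python
import os


def auth_headers(env_var: str = "AIJISU_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"Authorization": f"Bearer {api_key}"}
```

Reading the key from the environment keeps it out of source code; the returned dictionary can be passed as the headers argument of an HTTP client.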

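4. Voice ID Notes

voice_id identifies the voice to use when calling TTS.

Voice types:

| Type | Note |
| --- | --- |
| System voices | Voices provided by the platform; users can query and use them |
| Private voices | Voices created through Voice Cloning or Voice Design |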

When Voice Cloning or Voice Design succeeds, the endpoint returns a voice_id. You can then pass that voice_id to /v1/audio/tasks to generate speech.
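
The hand-off from a create response to a speech request might look like the sketch below. It assumes the response shape shown in the response examples on this page (a top-level voice_id and a nested voice.compatible_models list); the function name build_tts_request is hypothetical.

```python
def build_tts_request(create_response: dict, text: str, model: str) -> dict:
    """Turn a clone/design response into a /v1/audio/tasks request body."""
    voice_id = create_response["voice_id"]
    compatible = create_response.get("voice", {}).get("compatible_models", [])
    # Only reject the model when the response actually lists compatible models.
    if compatible and model not in compatible:
        raise ValueError(f"{voice_id} is not listed as compatible with {model}")
    return {"model": model, "text": text, "voice_id": voice_id}
```

Checking compatible_models locally is a convenience only; the service remains the authority on whether a voice can be used with a model.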

5. List Voices

5.1 Details

GET /v1/audio/voices

Optional query parameters:

| Parameter | Type | Required | Note |
| --- | --- | --- | --- |
| model | string | No | Returns voices compatible with this model. Optional values: minimax-speech-2.8-hd or minimax-speech-02-hd |
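
As an illustration of the optional model parameter, this sketch builds the request URL with and without the filter. BASE_URL is the placeholder host used on this page, and the names voices_url and KNOWN_MODELS are assumptions made for the example.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.xxx.xx"  # placeholder host used throughout this page
KNOWN_MODELS = ("minimax-speech-2.8-hd", "minimax-speech-02-hd")


def voices_url(model=None):
    """Return the List Voices URL, optionally filtered by model."""
    url = f"{BASE_URL}/v1/audio/voices"
    if model is None:
        return url
    if model not in KNOWN_MODELS:
        raise ValueError(f"model must be one of {KNOWN_MODELS}")
    return f"{url}?{urlencode({'model': model})}"
```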

5.2 Query All Voices

curl "https://api.xxx.xx/v1/audio/voices" \
-H "Authorization: Bearer YOUR_API_KEY"

5.3 Query Speech 2.8 HD Voices

curl "https://api.xxx.xx/v1/audio/voices?model=minimax-speech-2.8-hd" \
-H "Authorization: Bearer YOUR_API_KEY"

5.4 Query Speech 02 HD Voices

curl "https://api.xxx.xx/v1/audio/voices?model=minimax-speech-02-hd" \
-H "Authorization: Bearer YOUR_API_KEY"

5.5 Response Examples

{
"object": "audio.voice.list",
"model": "minimax-speech-2.8-hd",
"data": [
{
"voice_id": "Chinese (Mandarin)_Kind-hearted_Elder",
"display_name": "Kind-hearted Elder",
"language": "Chinese (Mandarin)",
"description": "MiniMax system voice",
"source_type": "system",
"visibility": "public",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": null,
"created_at": "2026-06-23T18:05:59Z"
},
{
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"display_name": "local-design-taskid-smoke",
"language": "Chinese (Mandarin)",
"description": "Local smoke test for task id in sync response.",
"source_type": "voice_design",
"visibility": "private",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": "https://api.xxx.xx/media/preview.mp3",
"created_at": "2026-06-24T08:42:23Z"
}
]
}

5.6 Response Fields

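| Field | Note |
| --- | --- |
| object | Object type, audio.voice.list |
| model | Model name |
| data | Array of voices |
| data[].voice_id | Voice ID; pass this value when calling TTS |
| data[].display_name | Voice name |
| data[].language | Voice language |
| data[].description | Voice description |
| data[].source_type | Voice source: system, voice_clone, or voice_design |
| data[].visibility | Visibility: public or private |
| data[].status | Status, such as active |
| data[].compatible_models | TTS models that can use this voice |
| data[].preview_audio_url | Preview audio URL; null in the example when no preview is available |
| data[].created_at | Creation time |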
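
To show how these fields can be consumed, the sketch below groups active voices by source_type. It assumes a parsed response with the shape of the example in 5.5; the function name group_voice_ids is hypothetical.

```python
from collections import defaultdict


def group_voice_ids(voice_list: dict) -> dict:
    """Group active voice_ids by source_type from a List Voices response."""
    groups = defaultdict(list)
    for voice in voice_list.get("data", []):
        if voice.get("status") == "active":
            groups[voice.get("source_type", "unknown")].append(voice["voice_id"])
    return dict(groups)
```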

6. Voice Cloning

Voice Cloning creates a new private voice_id from a reference audio.

Suitable scenarios:

  • Generating voices for digital humans, audiobooks, and video
  • Voices that can be reused in TTS calls

6.1 Details

POST /v1/audio/voices/clone

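The request body is JSON.

| Parameter | Type | Required | Note |
| --- | --- | --- | --- |
| audio_url | string | Yes | Reference audio URL; must be reachable by the endpoint service |
| text | string | Recommended | Text of the reference audio or preview text; affects quality and generation |
| preview_text | string | No | Compatibility field for text. If you pass both, keeping it the same as text is recommended |
| display_name | string | No | Voice name, shown in the voice list |
| name | string | No | Compatibility field for display_name |
| language | string | No | Voice language, such as Chinese (Mandarin) or English |
| description | string | No | Voice description |
| noise_reduction | boolean | No | Whether to apply noise reduction |
| need_volume_normalization | boolean | No | Whether to apply volume normalization |
| accuracy | number/string | No | Parameter that the platform supports passing through |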
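
A request body for this endpoint could be assembled as in the following sketch, which covers a subset of the parameters above and leaves out any optional field that is not set. The helper name build_clone_payload is an assumption for illustration.

```python
def build_clone_payload(audio_url, text=None, display_name=None, language=None,
                        noise_reduction=None, need_volume_normalization=None):
    """Assemble a clone request body, omitting optional fields left as None."""
    if not audio_url:
        raise ValueError("audio_url is required")
    optional = {
        "text": text,
        "display_name": display_name,
        "language": language,
        "noise_reduction": noise_reduction,
        "need_volume_normalization": need_volume_normalization,
    }
    payload = {"audio_url": audio_url}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Filtering on None rather than on truthiness keeps an explicit False for the boolean flags in the body.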

6.2 Reference Audio Recommendations

recommended
audio,
audiorecommended 10,
quality, music,
stable, do not or
textIf audio, recommendedto change text
use

6.3 Request Examples

curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"audio_url": "https://example.com/audio/reference-speaker.mp3",
"text": ". Method. ",
"display_name": "course-teacher-voice",
"language": "Chinese (Mandarin)"}'

6.4 Noise Reduction and Volume Normalization

curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"audio_url": "https://example.com/audio/noisy-reference.wav",
"text": " Yes Voice Cloning reference audio, voice stable. ",
"display_name": "cleaned-brand-speaker",
"language": "Chinese (Mandarin)",
"description": " voice",
"noise_reduction": true,
"need_volume_normalization": true}'

6.5 English Voice Cloning

curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"audio_url": "https://example.com/audio/english-host.mp3",
"text": "Welcome back to the show. Today we are going to explore a simple but powerful idea.",
"display_name": "english-podcast-host",
"language": "English",
"description": "Warm English podcast host voice"}'

6.6 Response Examples

{
"object": "audio.voice",
"model": "minimax-voice-clone",
"task_id": "task_abc123",
"voice_id": "VoiceClone123456",
"preview_audio_url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
"voice": {
"voice_id": "VoiceClone123456",
"display_name": "course-teacher-voice",
"language": "Chinese (Mandarin)",
"description": " voice",
"source_type": "voice_clone",
"visibility": "private",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
"created_at": "2026-06-24T08:42:23Z"
},
"billing_contract": {
"billing_version": "media-v1",
"public_model": "minimax-voice-clone",
"operation": "audio.voice_clone",
"settlement_policy": "fixed_at_estimate",
"billing_stage": "final",
"facts": {
"voice_clones": 1,
"preview_characters": 36
}
},
"outputs": [
{
"url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
"type": "audio"
}
]
}

7. Voice Design

Voice Design generates a new private voice_id from a description; you do not need to pass reference audio.

Suitable scenarios:

  • Video voiceovers
  • Digital humans
  • Quickly generating a voice
  • Scenarios with no reference audio, only a description

7.1 Details

POST /v1/audio/voices/design

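The request body is JSON.

| Parameter | Type | Required | Note |
| --- | --- | --- | --- |
| prompt | string | Yes | Voice design description covering the voice, language, and scenario |
| preview_text | string | Yes | Text used to generate the preview audio |
| text | string | No | Compatibility field for preview_text |
| display_name | string | No | Voice name, shown in the voice list |
| name | string | No | Compatibility field for display_name |
| language | string | No | Voice language, such as Chinese (Mandarin) or English |
| description | string | No | Voice description |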

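7.2 Prompt Recommendations

A prompt is recommended to describe the following:

| Dimension | Example |
| --- | --- |
| Language | Chinese Mandarin, English, Cantonese |
| Gender | male, female |
| Age | young adult, middle-aged, elder |
| Timbre | warm, clear, soft, energetic, calm |
| Scenario | product demo, customer service, audiobook, game character |
| Tone | friendly, confident, gentle, dramatic |
| Pace | slow, medium pace, lively |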
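
One way to keep prompts consistent is to compose them from the kinds of attributes listed above. The sketch below is only an illustration: the parameter names (timbre, tone, pace, and so on) and the sentence template are assumptions, and the API accepts any free-form prompt string.

```python
def build_design_prompt(language, gender, age, timbre, scenario, tone=None, pace=None):
    """Compose a Voice Design prompt from a few descriptive attributes."""
    traits = ", ".join(timbre)
    prompt = f"A {traits} {language} {age} {gender} voice for {scenario}"
    # Optional attributes are appended only when provided.
    extras = [part for part in (tone, pace) if part]
    if extras:
        prompt += ", " + ", ".join(extras)
    return prompt + "."
```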

7.3 Chinese Product Narration Voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A warm, calm Chinese Mandarin female narrator voice for short product demos, clear diction, gentle confidence, studio quality.",
"preview_text": ", YesaVoice Design. newspeechCapability. ",
"display_name": "warm-product-narrator",
"language": "Chinese (Mandarin)",
"description": " andshort-video narration "}'

7.4 Digital Human Voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A confident Chinese Mandarin female digital human presenter voice, natural conversational tone, bright but not exaggerated, suitable for business explanation videos.",
"preview_text": ", Yes.. ",
"display_name": "digital-human-presenter",
"language": "Chinese (Mandarin)",
"description": "digital human voice"}'

7.5 Customer-Service Broadcast Voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A polite Chinese Mandarin customer service voice, patient, clear, stable, friendly, suitable for service notifications and call center messages.",
"preview_text": ", processingcompleted.. ",
"display_name": "customer-service-clear",
"language": "Chinese (Mandarin)",
"description": " and voice"}'

7.6 Game Character Voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A young Chinese Mandarin fantasy game character voice, playful, lively, slightly mysterious, expressive but clear.",
"preview_text": ".. ",
"display_name": "fantasy-guide-character",
"language": "Chinese (Mandarin)",
"description": " voice"}'

7.7 English Commercial Voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A modern English commercial voice, energetic, premium, confident, suitable for product launch ads and social media videos.",
"preview_text": "Meet the new way to create, edit, and publish your ideas in minutes.",
"display_name": "english-commercial-premium",
"language": "English",
"description": "English commercial voice for product ads"}'

7.8 Audiobook Voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt": "A mature Chinese Mandarin audiobook narrator voice, calm, steady, immersive, with clear pronunciation and comfortable pacing.",
"preview_text": ". ",
"display_name": "audiobook-calm-narrator",
"language": "Chinese (Mandarin)",
"description": "audiobook voice"}'

7.9 Response Examples

{
"object": "audio.voice",
"model": "minimax-voice-design",
"task_id": "task_437fb17536aa4ff7830ffb7a39f43a99",
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
"voice": {
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"display_name": "warm-product-narrator",
"language": "Chinese (Mandarin)",
"description": " andshort-video narration ",
"source_type": "voice_design",
"visibility": "private",
"status": "active",
"compatible_models": [
"minimax-speech-2.8-hd",
"minimax-speech-02-hd"
],
"preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
"created_at": "2026-06-24T08:42:23Z"
},
"billing_contract": {
"billing_version": "media-v1",
"public_model": "minimax-voice-design",
"operation": "audio.voice_design",
"settlement_policy": "fixed_at_estimate",
"billing_stage": "final",
"facts": {
"voice_designs": 1,
"preview_characters": 12
}
},
"outputs": [
{
"url": "https://api.xxx.xx/media/design-preview.mp3",
"type": "audio"
}
]
}

8. Speech Generation with a voice_id

After Voice Cloning or Voice Design succeeds, you can pass the returned voice_id to /v1/audio/tasks to generate speech.

8.1 Generate Chinese Speech with a Voice Design Voice

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "., fast. ",
"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
"speed": 1.0,
"response_format": "url"}'

8.2 Generate a Course Explanation with a Voice Cloning Voice

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ".. ",
"voice_id": "VoiceClone123456",
"speed": 0.95,
"response_format": "url"}'

8.3 Query a Speech Generation Task

curl "https://api.xxx.xx/v1/audio/tasks/task_abc123" \
-H "Authorization: Bearer YOUR_API_KEY"

8.4 Completed Speech Generation Task Response Example

{
"object": "audio.generation.job",
"task_id": "task_abc123",
"model": "minimax-speech-2.8-hd",
"status": "completed",
"audio_url": "https://api.xxx.xx/media/output.mp3",
"result": {
"audio_url": "https://api.xxx.xx/media/output.mp3",
"outputs": [
"https://api.xxx.xx/media/output.mp3"
],
"audios": [
{
"url": "https://api.xxx.xx/media/output.mp3"
}
]
}
}
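
The completed response above carries the same URL in several places. A client might read it defensively, as in this sketch; the function name extract_audio_url is hypothetical, and only the "completed" status shown above is assumed to signal that audio is ready.

```python
def extract_audio_url(task: dict):
    """Return the first audio URL in a completed task response, else None."""
    if task.get("status") != "completed":
        return None
    result = task.get("result") or {}
    # Check the top-level field first, then the nested alternatives.
    candidates = [task.get("audio_url"), result.get("audio_url")]
    candidates += result.get("outputs") or []
    candidates += [a.get("url") for a in result.get("audios") or []]
    return next((url for url in candidates if url), None)
```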

9. Billing Notes

Actual billing is determined by the aijisu console and your account. The billing basis for each endpoint is as follows.

| Capability | Billing detail |
| --- | --- |
| List Voices | Not billed |
| Voice Cloning | voice_clones * 1.5 + preview_characters * 0.0003 |
| Voice Design | voice_designs * 3 + preview_characters * 0.00003 |
| Speech generation | Billed according to the TTS model's billing rules |

Note:

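  • preview_characters is the number of Unicode characters in the preview text.
  • Chinese characters, English letters, spaces, punctuation, line breaks, and emoji are all counted.
  • The count is not UTF-8 bytes and not tokens.
  • Voice Cloning and Voice Design are billed when the request is submitted.
  • If a request fails, the platform handles it according to its failure rules.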
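
The formulas above can be applied as in the following sketch. It assumes that "Unicode characters" means code points, which is what Python's len() counts on a str; the page does not state the unit of the result, so the return value is left unitless. The function name estimate_voice_cost is hypothetical.

```python
def estimate_voice_cost(operation: str, preview_text: str) -> float:
    """Apply the billing formulas above to one clone or design request."""
    # (fixed fee per voice, fee per preview character), copied from the table.
    rates = {"voice_clone": (1.5, 0.0003), "voice_design": (3.0, 0.00003)}
    fixed, per_char = rates[operation]
    # len() counts Unicode code points, not UTF-8 bytes or tokens.
    preview_characters = len(preview_text)
    return fixed + preview_characters * per_char
```

Under this assumption an emoji built from several code points counts as several characters, so treat the result as an estimate and rely on the billing_contract facts in the response for the actual count.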

10. Node.js Example

10.1 Voice Design

const response = await fetch("https://api.xxx.xx/v1/audio/voices/design", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "A warm Chinese Mandarin female narrator voice, clear and calm.",
    preview_text: ", YesVoice Design. ",
    display_name: "node-design-voice",
    language: "Chinese (Mandarin)"
  })
});

const data = await response.json();
console.log(data.voice_id);
console.log(data.preview_audio_url);

10.2 Create a Speech Task with the Generated voice_id

const voiceId = "ttv-voice-2026062416421526-E4jmMP8B";

const response = await fetch("https://api.xxx.xx/v1/audio/tasks", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "minimax-speech-2.8-hd",
    text: " Yesause voicegeneration speech. ",
    voice_id: voiceId,
    response_format: "url"
  })
});

const task = await response.json();
console.log(task.task_id);

11. Python Example

11.1 Voice Cloning

import requests

api_key = "YOUR_API_KEY"

response = requests.post(
    "https://api.xxx.xx/v1/audio/voices/clone",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "audio_url": "https://example.com/audio/reference-speaker.mp3",
        "text": ". ",
        "display_name": "python-clone-voice",
        "language": "Chinese (Mandarin)",
        "noise_reduction": True,
        "need_volume_normalization": True,
    },
    timeout=180,
)

data = response.json()
print(data["voice_id"])
print(data.get("preview_audio_url"))

11.2 List Voices

import requests

api_key = "YOUR_API_KEY"

response = requests.get(
    "https://api.xxx.xx/v1/audio/voices",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"model": "minimax-speech-2.8-hd"},
    timeout=30,
)

# Print the ID and display name of every voice in the response.
voices = response.json()["data"]
for voice in voices:
    print(voice["voice_id"], voice.get("display_name"))

12. Common Errors

12.1 Authentication

{
"error": {
"message": "API key required.",
"type": "invalid_request_error",
"code": "api_key_required"
}
}

Fix: Check whether the request headers include Authorization: Bearer YOUR_API_KEY.
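
The error examples in this section share one envelope: an error object with message, type, and code. A client could surface it as an exception, as in this sketch; the ApiError class and the raise_for_api_error helper are hypothetical names, not part of the API.

```python
class ApiError(Exception):
    """Hypothetical exception type carrying the documented error fields."""

    def __init__(self, message, error_type=None, code=None):
        super().__init__(message)
        self.error_type = error_type
        self.code = code


def raise_for_api_error(body: dict) -> dict:
    """Raise ApiError if the body has an "error" object, else return the body."""
    error = body.get("error")
    if error:
        raise ApiError(error.get("message", "unknown error"),
                       error.get("type"), error.get("code"))
    return body
```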

12.2 Voice Cloning Missing audio_url

{
"error": {
"message": "`audio_url` is required.",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}

Fix: Pass in a reference audio URL.

12.3 Voice Design Missing prompt

{
"error": {
"message": "`prompt` is required.",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}

Fix: Pass in a prompt that describes the voice, language, and scenario.

12.4 Voice Design Missing preview_text

{
"error": {
"message": "`preview_text` is required.",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}

Fix: Pass in the text used to generate the preview audio.

12.5 voice_id Unavailable

{
"error": {
"message": "`voice_id` is not visible for the current client or is not compatible with this model",
"type": "invalid_request_error",
"code": "invalid_request_parameter"
}
}
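
Possible causes:

  • The voice_id is wrong.
  • The voice is a private voice that is not visible to the current client.
  • The voice is not compatible with the TTS model.
  • The voice is unavailable.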

Fix:

  1. Call GET /v1/audio/voices?model=... to query the voices.
  2. Take a voice_id from the returned result.
  3. Pass that voice_id to /v1/audio/tasks.
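
The steps above can be sketched as a local check against a parsed List Voices response. The function name find_usable_voice is hypothetical, and the check relies only on the status and compatible_models fields shown in 5.5.

```python
def find_usable_voice(voice_list: dict, voice_id: str, model: str):
    """Return the matching voice entry if it is active and supports model."""
    for voice in voice_list.get("data", []):
        if voice.get("voice_id") != voice_id:
            continue
        if voice.get("status") == "active" and model in voice.get("compatible_models", []):
            return voice
    return None
```

A None result means the voice is either absent from the list visible to the current key or not usable with that model, which matches the causes listed for this error.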

13. Details

13.1 Voice Cloning

If you have reference audio and need to reproduce that voice, use Voice Cloning.

Typical scenarios:

  • audio.
  • audio.
  • TTS reference.

13.2 Voice Design

If you have no reference audio and only a description of the voice, use Voice Design.

Typical scenarios:

  • Digital humans.
  • Generating voices for video.
  • generation.
  • fast or voice.

13.3 display_name Naming

Use a stable, readable name for display_name.

Example:

brand-female-presenter
course-teacher-male
customer-service-clear
game-guide-young
english-commercial-premium

13.4 Saving the voice_id

After a voice is created successfully, save:

  • voice_id
  • display_name
  • source_type
  • preview_audio_url
  • compatible_models
  • created_at

The most important field is voice_id. TTS calls require you to pass the voice_id.
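
A sketch of capturing those fields from a create response follows. It assumes the nested voice object shown in the response examples in 6.6 and 7.9; the names SAVED_FIELDS and voice_record are hypothetical, and where the record is stored is left to the caller.

```python
import json

SAVED_FIELDS = ("voice_id", "display_name", "source_type",
                "preview_audio_url", "compatible_models", "created_at")


def voice_record(create_response: dict) -> str:
    """Serialize the fields worth saving from a clone/design response."""
    voice = create_response.get("voice", {})
    record = {field: voice.get(field) for field in SAVED_FIELDS}
    # Fall back to the top-level voice_id if the nested object lacks one.
    record["voice_id"] = record["voice_id"] or create_response["voice_id"]
    return json.dumps(record, ensure_ascii=False)
```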

13.5 Call Recommendations

  • Voice CloningandVoice Design TTS, recommended.
  • Do not expose your API key.
  • Calling the aijisu API from a backend service is recommended.
  • Reuse a saved voice_id instead of creating the same voice again.
  • create audioand.