MiniMax Speech Voice Management

This document explains how to use the MiniMax voice capabilities: the voice list, Voice Cloning, and Voice Design.

If you only need to generate speech from text, use the speech generation endpoints:

POST /v1/audio/tasks
GET /v1/audio/tasks/{task_id}

If you want to create your own voice_id and then use that voice_id in speech generation, use the voice endpoints on this page.

1. Supported Capabilities

Capability	Endpoint	Billed	Note
List Voices	`GET /v1/audio/voices`	No	Returns system voices and private voices
Voice Cloning	`POST /v1/audio/voices/clone`	Yes	Generates a `voice_id` from reference audio
Voice Design	`POST /v1/audio/voices/design`	Yes	Generates a `voice_id` from a text description
Generate speech with a voice	`POST /v1/audio/tasks`	Yes	Pass the `voice_id` to a TTS model to generate audio

Recommended TTS models:

Model name	Note
`minimax-speech-2.8-hd`	Newer speech generation model, suited to video, digital human, and ad voiceover
`minimax-speech-02-hd`	Stable speech generation model, suited to audiobooks and text reading

A voice_id created by Voice Cloning or Voice Design can be used with these TTS models.

2. Endpoint URL

The examples use the following base URL:

https://api.xxx.xx

All requests go to the aijisu API.

Operation	Method	Path
List Voices	`GET`	`/v1/audio/voices`
Voice Cloning	`POST`	`/v1/audio/voices/clone`
Voice Design	`POST`	`/v1/audio/voices/design`
Submit speech generation task	`POST`	`/v1/audio/tasks`
Query speech generation task	`GET`	`/v1/audio/tasks/{task_id}`

3. Authentication

The endpoints use Bearer Token authentication:

Authorization: Bearer YOUR_API_KEY

Example:

curl https://api.xxx.xx/v1/audio/voices \
 -H "Authorization: Bearer YOUR_API_KEY"
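
The same header can be built once and reused across calls. This is a minimal sketch, assuming the key is kept in an environment variable named `AIJISU_API_KEY`; that name is hypothetical and not defined by the platform.

```python
import os


def auth_headers(api_key=None):
    """Build the Authorization header used by the endpoints on this page."""
    # AIJISU_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("AIJISU_API_KEY")
    if not key:
        raise ValueError("API key is missing")
    return {"Authorization": f"Bearer {key}"}
```

Reading the key from the environment keeps it out of source files, which fits the advice to call the API from a backend service.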

4. Voice ID Notes

voice_id is the voice identifier used when calling TTS.

Voice types:

Type	Note
System voices	Voices provided by the platform; users can query and use them
Private voices	Voices created by Voice Cloning or Voice Design, with `private` visibility

After Voice Cloning or Voice Design succeeds, the endpoint returns a voice_id. Pass that voice_id to /v1/audio/tasks to generate speech.

5. List Voices

5.1 Details

GET /v1/audio/voices

Optional query parameters:

Parameters	Type	Required	Note
`model`	string	No	Filters voices by compatible model. Optional values: `minimax-speech-2.8-hd` or `minimax-speech-02-hd`

5.2 Query all voices

curl "https://api.xxx.xx/v1/audio/voices" \
 -H "Authorization: Bearer YOUR_API_KEY"

5.3 Query Speech 2.8 HD voices

curl "https://api.xxx.xx/v1/audio/voices?model=minimax-speech-2.8-hd" \
 -H "Authorization: Bearer YOUR_API_KEY"

5.4 Query Speech 02 HD voices

curl "https://api.xxx.xx/v1/audio/voices?model=minimax-speech-02-hd" \
 -H "Authorization: Bearer YOUR_API_KEY"

5.5 Response Examples

{
  "object": "audio.voice.list",
  "model": "minimax-speech-2.8-hd",
  "data": [
    {
      "voice_id": "Chinese (Mandarin)_Kind-hearted_Elder",
      "display_name": "Kind-hearted Elder",
      "language": "Chinese (Mandarin)",
      "description": "MiniMax system voice",
      "source_type": "system",
      "visibility": "public",
      "status": "active",
      "compatible_models": [
        "minimax-speech-2.8-hd",
        "minimax-speech-02-hd"
      ],
      "preview_audio_url": null,
      "created_at": "2026-06-23T18:05:59Z"
    },
    {
      "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
      "display_name": "local-design-taskid-smoke",
      "language": "Chinese (Mandarin)",
      "description": "Local smoke test for task id in sync response.",
      "source_type": "voice_design",
      "visibility": "private",
      "status": "active",
      "compatible_models": [
        "minimax-speech-2.8-hd",
        "minimax-speech-02-hd"
      ],
      "preview_audio_url": "https://api.xxx.xx/media/preview.mp3",
      "created_at": "2026-06-24T08:42:23Z"
    }
  ]
}

5.6 Response Fields

Field	Note
`object`	Object type, `audio.voice.list`
`model`	Model name
`data`	Array of voices
`data[].voice_id`	Voice ID; the value to pass when calling TTS
`data[].display_name`	Voice name
`data[].language`	Voice language
`data[].description`	Voice description
`data[].source_type`	Voice source: `system`, `voice_clone`, or `voice_design`
`data[].visibility`	Visibility: `public` or `private`
`data[].status`	Status, for example `active`
`data[].compatible_models`	TTS models that can use this voice
`data[].preview_audio_url`	Preview audio URL; may be `null`
`data[].created_at`	Creation time
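
As an illustration of how these fields might be consumed, the sketch below picks the private voices out of a List Voices response and can narrow them to one TTS model. The function name is hypothetical, and the sample data is a trimmed copy of the response example in 5.5.

```python
def private_voice_ids(voice_list, model=None):
    """Return voice_id values of private voices, optionally limited to one model."""
    ids = []
    for voice in voice_list.get("data", []):
        if voice.get("visibility") != "private":
            continue
        if model and model not in voice.get("compatible_models", []):
            continue
        ids.append(voice["voice_id"])
    return ids


# Trimmed copy of the response example in 5.5.
sample = {
    "object": "audio.voice.list",
    "data": [
        {"voice_id": "Chinese (Mandarin)_Kind-hearted_Elder",
         "visibility": "public",
         "compatible_models": ["minimax-speech-2.8-hd", "minimax-speech-02-hd"]},
        {"voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
         "visibility": "private",
         "compatible_models": ["minimax-speech-2.8-hd", "minimax-speech-02-hd"]},
    ],
}

print(private_voice_ids(sample, model="minimax-speech-2.8-hd"))
# ['ttv-voice-2026062416421526-E4jmMP8B']
```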

6. Voice Cloning

Voice Cloning creates a new private voice_id from a reference audio.

Suitable scenarios:

Generating voices for digital humans, audiobooks, and video
Reusing the cloned voice in later TTS calls

6.1 Details

POST /v1/audio/voices/clone

The request body is JSON.

Parameters	Type	Required	Note
`audio_url`	string	Yes	Reference audio URL; must be accessible to the endpoint service
`text`	string	Recommended	Text of the reference audio or preview text; helps cloning quality and preview generation
`preview_text`	string	No	Compatibility field for `text`. If both are passed, keep it the same as `text`
`display_name`	string	No	Voice name shown in the voice list
`name`	string	No	Compatibility field for `display_name`
`language`	string	No	Voice language, for example `Chinese (Mandarin)` or `English`
`description`	string	No	Voice description
`noise_reduction`	boolean	No	Whether to apply noise reduction
`need_volume_normalization`	boolean	No	Whether to apply volume normalization
`accuracy`	number/string	No	Accuracy parameter; the platform passes it through
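
One way to assemble a request body from these parameters is sketched below. The helper name is hypothetical; it only enforces what the table states (`audio_url` is required) and drops unset optional fields so they are not sent as `null`.

```python
def build_clone_payload(audio_url, text=None, display_name=None, language=None,
                        description=None, noise_reduction=None,
                        need_volume_normalization=None):
    """Assemble a JSON body for POST /v1/audio/voices/clone."""
    if not audio_url:
        raise ValueError("audio_url is required")
    payload = {
        "audio_url": audio_url,
        "text": text,
        "display_name": display_name,
        "language": language,
        "description": description,
        "noise_reduction": noise_reduction,
        "need_volume_normalization": need_volume_normalization,
    }
    # Omit optional fields that were not provided.
    return {key: value for key, value in payload.items() if value is not None}
```

For example, `build_clone_payload("https://example.com/audio/reference-speaker.mp3", noise_reduction=True)` yields a body with just those two keys.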

6.2 Reference audio recommendations

Item	Recommendation
audio	,
audio	recommended 10,
quality	, music,
	stable, do not or
text	If the audio has a known transcript, it is recommended to pass it as `text`
	use

6.3 Request Examples

curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"audio_url": "https://example.com/audio/reference-speaker.mp3",
 "text": ". Method. ",
 "display_name": "course-teacher-voice",
 "language": "Chinese (Mandarin)"}'

6.4 Noise reduction and volume normalization

curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"audio_url": "https://example.com/audio/noisy-reference.wav",
 "text": " Yes Voice Cloning reference audio, voice stable. ",
 "display_name": "cleaned-brand-speaker",
 "language": "Chinese (Mandarin)",
 "description": " voice",
 "noise_reduction": true,
 "need_volume_normalization": true}'

6.5 English Voice Cloning

curl -X POST "https://api.xxx.xx/v1/audio/voices/clone" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"audio_url": "https://example.com/audio/english-host.mp3",
 "text": "Welcome back to the show. Today we are going to explore a simple but powerful idea.",
 "display_name": "english-podcast-host",
 "language": "English",
 "description": "Warm English podcast host voice"}'

6.6 Response Examples

{
  "object": "audio.voice",
  "model": "minimax-voice-clone",
  "task_id": "task_abc123",
  "voice_id": "VoiceClone123456",
  "preview_audio_url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
  "voice": {
    "voice_id": "VoiceClone123456",
    "display_name": "course-teacher-voice",
    "language": "Chinese (Mandarin)",
    "description": " voice",
    "source_type": "voice_clone",
    "visibility": "private",
    "status": "active",
    "compatible_models": [
      "minimax-speech-2.8-hd",
      "minimax-speech-02-hd"
    ],
    "preview_audio_url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
    "created_at": "2026-06-24T08:42:23Z"
  },
  "billing_contract": {
    "billing_version": "media-v1",
    "public_model": "minimax-voice-clone",
    "operation": "audio.voice_clone",
    "settlement_policy": "fixed_at_estimate",
    "billing_stage": "final",
    "facts": {
      "voice_clones": 1,
      "preview_characters": 36
    }
  },
  "outputs": [
    {
      "url": "https://api.xxx.xx/media/voice-clone-preview.mp3",
      "type": "audio"
    }
  ]
}

7. Voice Design

Voice Design generates a new private voice_id from a text description; no reference audio is required.

Suitable scenarios:

Video voiceover voices
Digital human voices
Quickly generating a voice
Scenarios with no reference audio, only a description

7.1 Details

POST /v1/audio/voices/design

The request body is JSON.

Parameters	Type	Required	Note
`prompt`	string	Yes	Voice design description covering voice characteristics, language, and scenario
`preview_text`	string	Yes	Text used to generate the preview audio
`text`	string	No	Compatibility field for `preview_text`
`display_name`	string	No	Voice name shown in the voice list
`name`	string	No	Compatibility field for `display_name`
`language`	string	No	Voice language, for example `Chinese (Mandarin)` or `English`
`description`	string	No	Voice description

7.2 Prompt recommendations

The prompt should describe the following:

Dimension	Example
Language	Chinese Mandarin, English, Cantonese
Gender	male, female
Age	young adult, middle-aged, elder
Timbre	warm, clear, soft, energetic, calm
Scenario	product demo, customer service, audiobook, game character
Emotion	friendly, confident, gentle, dramatic
Pace	slow, medium pace, lively
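
The example values in the table can be combined into a single prompt string. The sketch below is one possible way to do that; the function and its argument names (`gender`, `age`, `qualities`, `style`, `pace`) are labels chosen for this illustration and are not API parameters.

```python
def compose_prompt(language, gender=None, age=None, qualities=(),
                   scenario=None, style=(), pace=None):
    """Join voice attributes into one English prompt sentence."""
    subject = " ".join(part for part in (age, language, gender) if part)
    prompt = f"A {subject} voice"
    if scenario:
        prompt += f" for {scenario}"
    traits = list(qualities) + list(style)
    if pace:
        traits.append(pace)
    if traits:
        prompt += ", " + ", ".join(traits)
    return prompt + "."


print(compose_prompt(
    "Chinese Mandarin",
    gender="female",
    age="young adult",
    qualities=["warm", "clear"],
    scenario="product demo",
    style=["friendly"],
    pace="medium pace",
))
# A young adult Chinese Mandarin female voice for product demo, warm, clear, friendly, medium pace.
```

The result is passed as `prompt` to `POST /v1/audio/voices/design`.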

7.3 Chinese narration voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"prompt": "A warm, calm Chinese Mandarin female narrator voice for short product demos, clear diction, gentle confidence, studio quality.",
 "preview_text": ", YesaVoice Design. newspeechCapability. ",
 "display_name": "warm-product-narrator",
 "language": "Chinese (Mandarin)",
 "description": " andshort-video narration "}'

7.4 Digital human voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"prompt": "A confident Chinese Mandarin female digital human presenter voice, natural conversational tone, bright but not exaggerated, suitable for business explanation videos.",
 "preview_text": ", Yes.. ",
 "display_name": "digital-human-presenter",
 "language": "Chinese (Mandarin)",
 "description": "digital human voice"}'

7.5 Customer-service broadcast voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"prompt": "A polite Chinese Mandarin customer service voice, patient, clear, stable, friendly, suitable for service notifications and call center messages.",
 "preview_text": ", processingcompleted.. ",
 "display_name": "customer-service-clear",
 "language": "Chinese (Mandarin)",
 "description": " and voice"}'

7.6 Game character voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"prompt": "A young Chinese Mandarin fantasy game character voice, playful, lively, slightly mysterious, expressive but clear.",
 "preview_text": ".. ",
 "display_name": "fantasy-guide-character",
 "language": "Chinese (Mandarin)",
 "description": " voice"}'

7.7 English voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"prompt": "A modern English commercial voice, energetic, premium, confident, suitable for product launch ads and social media videos.",
 "preview_text": "Meet the new way to create, edit, and publish your ideas in minutes.",
 "display_name": "english-commercial-premium",
 "language": "English",
 "description": "English commercial voice for product ads"}'

7.8 Audiobook voice

curl -X POST "https://api.xxx.xx/v1/audio/voices/design" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"prompt": "A mature Chinese Mandarin audiobook narrator voice, calm, steady, immersive, with clear pronunciation and comfortable pacing.",
 "preview_text": ". ",
 "display_name": "audiobook-calm-narrator",
 "language": "Chinese (Mandarin)",
 "description": "audiobook voice"}'

7.9 Response Examples

{
  "object": "audio.voice",
  "model": "minimax-voice-design",
  "task_id": "task_437fb17536aa4ff7830ffb7a39f43a99",
  "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
  "preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
  "voice": {
    "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
    "display_name": "warm-product-narrator",
    "language": "Chinese (Mandarin)",
    "description": " andshort-video narration ",
    "source_type": "voice_design",
    "visibility": "private",
    "status": "active",
    "compatible_models": [
      "minimax-speech-2.8-hd",
      "minimax-speech-02-hd"
    ],
    "preview_audio_url": "https://api.xxx.xx/media/design-preview.mp3",
    "created_at": "2026-06-24T08:42:23Z"
  },
  "billing_contract": {
    "billing_version": "media-v1",
    "public_model": "minimax-voice-design",
    "operation": "audio.voice_design",
    "settlement_policy": "fixed_at_estimate",
    "billing_stage": "final",
    "facts": {
      "voice_designs": 1,
      "preview_characters": 12
    }
  },
  "outputs": [
    {
      "url": "https://api.xxx.xx/media/design-preview.mp3",
      "type": "audio"
    }
  ]
}

8. Speech generation with a voice_id

After Voice Cloning or Voice Design succeeds, pass the returned voice_id to /v1/audio/tasks to generate speech.
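
A request to /v1/audio/tasks might be prepared in Python as sketched below, using only the standard library. The helper name is hypothetical, the host is the placeholder used throughout this page, and the body fields mirror the curl examples in this section. The function builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.xxx.xx"  # placeholder host used throughout this page


def build_tts_request(api_key, model, text, voice_id, speed=1.0):
    """Build (but do not send) a POST /v1/audio/tasks request."""
    body = {
        "model": model,
        "text": text,
        "voice_id": voice_id,
        "speed": speed,
        "response_format": "url",
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/tasks",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
# with urllib.request.urlopen(request, timeout=30) as response:
#     task = json.load(response)
```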

8.1 Generate Chinese speech with a Voice Design voice

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"model": "minimax-speech-2.8-hd",
 "text": "., fast. ",
 "voice_id": "ttv-voice-2026062416421526-E4jmMP8B",
 "speed": 1.0,
 "response_format": "url"}'

8.2 Generate a course explanation with a Voice Cloning voice

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
 -H "Authorization: Bearer YOUR_API_KEY" \
 -H "Content-Type: application/json" \
 -d '{"model": "minimax-speech-02-hd",
 "text": ".. ",
 "voice_id": "VoiceClone123456",
 "speed": 0.95,
 "response_format": "url"}'

8.3 Query a speech generation task

curl "https://api.xxx.xx/v1/audio/tasks/task_abc123" \
 -H "Authorization: Bearer YOUR_API_KEY"

8.4 Completed speech generation task response example

{
  "object": "audio.generation.job",
  "task_id": "task_abc123",
  "model": "minimax-speech-2.8-hd",
  "status": "completed",
  "audio_url": "https://api.xxx.xx/media/output.mp3",
  "result": {
    "audio_url": "https://api.xxx.xx/media/output.mp3",
    "outputs": [
      "https://api.xxx.xx/media/output.mp3"
    ],
    "audios": [
      {
        "url": "https://api.xxx.xx/media/output.mp3"
      }
    ]
  }
}
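
Because the task is queried separately from submission, a client typically polls until it completes and then reads the audio URL. The sketch below shows one way to do this. The helper names are hypothetical, `fetch_task` stands for any callable that performs `GET /v1/audio/tasks/{task_id}` and returns the parsed JSON, and the `failed` status is an assumption, since only `completed` appears on this page.

```python
import time


def extract_audio_url(task):
    """Return the first audio URL found in a task response, or None."""
    if task.get("audio_url"):
        return task["audio_url"]
    result = task.get("result") or {}
    if result.get("audio_url"):
        return result["audio_url"]
    outputs = result.get("outputs") or []
    if outputs:
        return outputs[0]
    audios = result.get("audios") or []
    if audios:
        return audios[0].get("url")
    return None


def wait_for_task(fetch_task, task_id, interval=2.0, max_attempts=30):
    """Poll fetch_task(task_id) until the status is "completed"."""
    for attempt in range(max_attempts):
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        # "failed" is an assumed status name, not confirmed by this page.
        if status == "failed":
            raise RuntimeError(f"task {task_id} failed")
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not complete in time")
```

Checking several locations for the URL reflects the response example, which repeats the same link under `audio_url`, `result.audio_url`, `result.outputs`, and `result.audios`.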

9. Billing Notes

Actual billing is determined by the aijisu console and your account. The endpoint response provides the billing basis.

Capability	Billing items	Billing formula
List Voices	None	Not billed
Voice Cloning	Clone count + preview characters	`voice_clones * 1.5 + preview_characters * 0.0003`
Voice Design	Design count + preview characters	`voice_designs * 3 + preview_characters * 0.00003`
Speech generation		Billed according to the TTS model's billing rules

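Note:

preview_characters is the number of Unicode characters in the preview text.
Chinese characters, English letters, spaces, punctuation, line breaks, and emoji are all counted.
It is not a UTF-8 byte count, and it is not a token count.
Voice Cloning and Voice Design are billed at submission.
If a request fails, the platform handles it according to its failed-request rules.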
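
The formulas in the billing table can be evaluated ahead of a request to get a rough estimate. This sketch assumes that "Unicode characters" means code points, which is what Python's `len()` counts. The platform may count some emoji sequences differently, and this page does not state the unit of the result.

```python
# Rates copied from the billing table: (per voice, per preview character).
RATES = {
    "voice_clone": (1.5, 0.0003),
    "voice_design": (3, 0.00003),
}


def estimate_voice_cost(operation, preview_text, count=1):
    """Estimate the charge for a clone or design request."""
    per_voice, per_char = RATES[operation]
    # len() counts Unicode code points, not UTF-8 bytes or tokens.
    preview_characters = len(preview_text)
    return count * per_voice + preview_characters * per_char


text = "Hi, 你好"
print(len(text))                  # 6 characters
print(len(text.encode("utf-8")))  # 10 bytes, which is not the billed quantity
```

With the 12 preview characters reported in the Voice Design response example, the formula gives 3 + 12 * 0.00003 = 3.00036.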

10. Node.js Example

10.1 Voice Design

const response = await fetch("https://api.xxx.xx/v1/audio/voices/design", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "A warm Chinese Mandarin female narrator voice, clear and calm.",
    preview_text: ", YesVoice Design. ",
    display_name: "node-design-voice",
    language: "Chinese (Mandarin)"
  })
});

// The response carries the new voice_id and a preview audio URL.
const data = await response.json();
console.log(data.voice_id);
console.log(data.preview_audio_url);

10.2 Create a speech task with the generated voice_id

const voiceId = "ttv-voice-2026062416421526-E4jmMP8B";

// Submit a speech generation task that uses the private voice.
const response = await fetch("https://api.xxx.xx/v1/audio/tasks", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "minimax-speech-2.8-hd",
    text: " Yesause voicegeneration speech. ",
    voice_id: voiceId,
    response_format: "url"
  })
});

// The returned task_id is used to query the task status.
const task = await response.json();
console.log(task.task_id);

11. Python Example

11.1 Voice Cloning

import requests

api_key = "YOUR_API_KEY"

# Submit the clone request with a 180-second timeout.
response = requests.post(
    "https://api.xxx.xx/v1/audio/voices/clone",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "audio_url": "https://example.com/audio/reference-speaker.mp3",
        "text": ". ",
        "display_name": "python-clone-voice",
        "language": "Chinese (Mandarin)",
        "noise_reduction": True,
        "need_volume_normalization": True,
    },
    timeout=180,
)

# The response carries the new voice_id and a preview audio URL.
data = response.json()
print(data["voice_id"])
print(data.get("preview_audio_url"))

11.2 List Voices

import requests

api_key = "YOUR_API_KEY"

# List the voices compatible with one TTS model.
response = requests.get(
    "https://api.xxx.xx/v1/audio/voices",
    headers={"Authorization": f"Bearer {api_key}"},
    params={"model": "minimax-speech-2.8-hd"},
    timeout=30,
)

voices = response.json()["data"]
for voice in voices:
    print(voice["voice_id"], voice.get("display_name"))

12. Common Errors

12.1 Authentication

{
  "error": {
    "message": "API key required.",
    "type": "invalid_request_error",
    "code": "api_key_required"
  }
}

Fix: check whether the request headers include Authorization: Bearer YOUR_API_KEY.
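
The error bodies in this section share one envelope (`error.message`, `error.type`, `error.code`), so a client can read them with a single helper. The sketch below uses a hypothetical function name and assumes nothing beyond the fields shown in these examples.

```python
import json


def parse_error(body):
    """Return (code, message) from an error response body, or None if absent."""
    data = json.loads(body) if isinstance(body, (str, bytes)) else body
    error = data.get("error") if isinstance(data, dict) else None
    if not error:
        return None
    return error.get("code"), error.get("message")


raw = ('{"error": {"message": "API key required.", '
       '"type": "invalid_request_error", "code": "api_key_required"}}')
print(parse_error(raw))
# ('api_key_required', 'API key required.')
```

Several errors on this page share the code `invalid_request_parameter`, so the message is what tells them apart.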

12.2 Voice Cloning missing audio_url

{
  "error": {
    "message": "`audio_url` is required.",
    "type": "invalid_request_error",
    "code": "invalid_request_parameter"
  }
}

Fix: pass in the reference audio URL as audio_url.

12.3 Voice Design missing prompt

{
  "error": {
    "message": "`prompt` is required.",
    "type": "invalid_request_error",
    "code": "invalid_request_parameter"
  }
}

Fix: pass in a prompt that describes the voice, language, and scenario.

12.4 Voice Design missing preview_text

{
  "error": {
    "message": "`preview_text` is required.",
    "type": "invalid_request_error",
    "code": "invalid_request_parameter"
  }
}

Fix: pass in the text used to generate the preview audio.

12.5 voice_id unavailable

{
  "error": {
    "message": "`voice_id` is not visible for the current client or is not compatible with this model",
    "type": "invalid_request_error",
    "code": "invalid_request_parameter"
  }
}

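Possible causes:

The voice_id is wrong.
The voice is a private voice that is not visible to the current client.
The voice is not compatible with the requested TTS model.
The voice has been removed or is unavailable.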

Fix:

Call GET /v1/audio/voices?model=... to query the available voices.
Pick a voice_id from the returned result.
Pass that voice_id to /v1/audio/tasks.
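
These steps can be turned into a pre-flight check that runs before a task is submitted. The sketch below takes a parsed List Voices response and uses a hypothetical helper name. Treating `active` as the only usable status is an assumption based on the response examples.

```python
def is_voice_usable(voice_list, voice_id, model):
    """Check that voice_id is listed, active, and compatible with model."""
    for voice in voice_list.get("data", []):
        if voice.get("voice_id") == voice_id:
            return (voice.get("status") == "active"
                    and model in voice.get("compatible_models", []))
    # The voice is not visible to this client at all.
    return False


listing = {
    "data": [
        {"voice_id": "VoiceClone123456", "status": "active",
         "compatible_models": ["minimax-speech-02-hd"]},
    ],
}
print(is_voice_usable(listing, "VoiceClone123456", "minimax-speech-02-hd"))   # True
print(is_voice_usable(listing, "VoiceClone123456", "minimax-speech-2.8-hd"))  # False
```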

13. Details

13.1 Voice Cloning

If you have reference audio and need the generated voice to resemble it, use Voice Cloning.

Typical scenarios:

audio.
audio.
TTS reference.

13.2 Voice Design

If you have no reference audio, only a description of the voice, use Voice Design.

Typical scenarios:

digital human.
video generation voice.
generation.
fast or voice.

13.3 Voice naming recommendations

It is recommended that display_name use a stable, descriptive name.

Example:

brand-female-presenter
course-teacher-male
customer-service-clear
game-guide-young
english-commercial-premium

13.4 Saving the voice_id

After a voice is created successfully, save:

voice_id
display_name
source_type
preview_audio_url
compatible_models
created_at

The key field is voice_id. TTS calls require passing the voice_id.
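
A minimal sketch of saving these fields from a clone or design response follows. The helper names and the JSON Lines file format are choices made for this illustration. A database table would serve equally well.

```python
import json
from pathlib import Path

SAVED_FIELDS = ("voice_id", "display_name", "source_type",
                "preview_audio_url", "compatible_models", "created_at")


def voice_record(create_response):
    """Pick the fields listed above out of a clone or design response."""
    voice = create_response.get("voice") or {}
    record = {field: voice.get(field) for field in SAVED_FIELDS}
    # Fall back to the top-level voice_id, which the response also carries.
    record["voice_id"] = record["voice_id"] or create_response.get("voice_id")
    return record


def save_voice_record(create_response, path):
    """Append one record per line to a JSON Lines file."""
    record = voice_record(create_response)
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```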

13.5 Call recommendations

Voice Cloning and Voice Design can take longer than a TTS submission, so a longer request timeout is recommended.
Do not expose your API Key.
It is recommended to call the aijisu API from your backend service.
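Reuse a saved voice_id instead of creating the same voice again.
After creating a voice, listen to the preview audio and confirm the result.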
