MiniMax Speech HD Generation
This document explains how to use the MiniMax Speech HD asynchronous speech generation models.
Supported models:
| Model name | Type | Recommended scenarios |
|---|---|---|
| minimax-speech-2.8-hd | text-to-speech | short-video narration, ad voiceover, digital-human voice, emotional narration, natural spoken voice |
| minimax-speech-02-hd | text-to-speech | audiobook, course explanation, customer-service broadcast, news broadcast, long-form narration, multilingual speech |
Endpoints (async task mode):
| Operation | Method | Path |
|---|---|---|
| submit speech task | POST | /v1/audio/tasks |
| query speech task | GET | /v1/audio/tasks/{task_id} |
URL example:
https:
1. Model overview
1.1 minimax-speech-2.8-hd
minimax-speech-2.8-hd is the newer speech generation model, suited to content that calls for emotional, natural-sounding speech.
Suitable scenarios:
- short-video narration
- ad voiceover
- digital human
- podcast intro
Recommended text example:
<#0.5#>. (laughs)
1.2 minimax-speech-02-hd
minimax-speech-02-hd is the stable speech generation model, suited to steady, long-text speech tasks.
Suitable scenarios:
- audiobook
- course explanation
- news broadcast
- speech
- long-form narration
- multilingual speech generation
Recommended text example:
2. Authentication
All requests require an API key.
Request headers:
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
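As a minimal sketch, the two headers above could be assembled in Python like this. The helper name `build_headers` is hypothetical and not part of the API; it only wraps the header format shown above.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the two request headers shown above for the given API key."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in an environment variable rather than in source code is a common choice, though nothing in this document requires it.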
3. Submit a speech generation task
URL:
POST https://api.xxx.xx/v1/audio/tasks
3.1 Request Parameters
| Parameter | Type | Required | Note |
|---|---|---|---|
| model | string | Yes | Model name; supports minimax-speech-2.8-hd and minimax-speech-02-hd |
| text | string | Yes | Text to convert to speech |
| input | string | No | Text to convert to speech; OpenAI-style alternative to text |
| voice_id | string | No | Voice ID (a platform-provided voice or other voice ID) |
| voice | string | No | Voice name; compatibility field |
| speed | number | No | Speech speed, range 0.5 to 2.0 |
| emotion | string | No | Emotion: happy, sad, angry, fearful, disgusted, surprised, neutral |
| language | string | No | Language: Chinese, English, Japanese, auto |
| output_format | string | No | Output format; url is recommended |
| response_format | string | No | Response format; url is recommended |
| sample_rate | number | No | Sample rate in Hz, such as 32000 or 44100 |
| pronunciation_dict | object | No | Pronunciation dictionary |
| timber_weights | array | No | Voice weights |
| subtitle_enable | boolean | No | Whether to generate subtitles |
| metadata | object | No | |
| extra_body | object | No | Additional parameters |
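Note:
- One of text and input is required.
- Prefer text.
- Avoid passing both text and input in the same request.
- This is an async task endpoint: submitting a task returns a task_id, which you then use to query the result.
- Billing counts every character in the text, including Chinese characters, English letters, punctuation, spaces, line breaks, and emoji.
- The voice_id in the examples is a sample value; replace it with a voice ID that is actually available.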
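The parameter rules above can be checked on the client before a request is sent. The following is a sketch only: `build_payload` and `SUPPORTED_MODELS` are hypothetical names, the argument `input_text` maps to the `input` field, and rejecting requests that carry both `text` and `input` is this sketch's own conservative choice rather than documented server behavior.

```python
SUPPORTED_MODELS = {"minimax-speech-2.8-hd", "minimax-speech-02-hd"}


def build_payload(model, text=None, input_text=None, voice_id=None, speed=None):
    """Build a request body, checking the rules listed in the notes above."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    # One of text / input is required; this sketch refuses to send both.
    if (text is None) == (input_text is None):
        raise ValueError("pass exactly one of text or input")
    # The parameter table gives 0.5 to 2.0 as the speed range.
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {"model": model}
    if text is not None:
        payload["text"] = text
    else:
        payload["input"] = input_text
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if speed is not None:
        payload["speed"] = speed
    return payload
```

The returned dictionary can be passed as the JSON body of the submit request.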
3.2 Request Examples
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "Welcome to the aijisu speech generation service."}'
3.3 Successful submission response example
{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "queued",
"raw_status": "SUBMITTED",
"progress": "0%",
"audio_url": null,
"result": null,
"error": null
}
4. Query the task result
URL:
GET https://api.xxx.xx/v1/audio/tasks/{task_id}
Request example:
curl -X GET "https://api.xxx.xx/v1/audio/tasks/task_xxxxxxxxxxxxx" \
-H "Authorization: Bearer YOUR_API_KEY"
4.1 Response example: generation in progress
{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "in_progress",
"raw_status": "IN_PROGRESS",
"progress": "45%",
"audio_url": null,
"result": null,
"error": null
}
4.2 Response example: generation completed
{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "completed",
"raw_status": "SUCCESS",
"progress": "100%",
"audio_url": "https://example.com/audio.mp3",
"result": {
"audio_url": "https://example.com/audio.mp3",
"outputs": [
"https://example.com/audio.mp3"
],
"audios": [
{
"url": "https://example.com/audio.mp3"
}
]
},
"error": null
}
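The completed response above carries the same URL in several places. A tolerant reader might check them in order, as in this sketch; the function name `extract_audio_url` and the lookup order are assumptions, not something the API prescribes.

```python
def extract_audio_url(task: dict):
    """Return the first audio URL found in a completed task response, else None."""
    if task.get("status") != "completed":
        return None
    # Top-level field first, then the nested result object.
    if task.get("audio_url"):
        return task["audio_url"]
    result = task.get("result") or {}
    if result.get("audio_url"):
        return result["audio_url"]
    outputs = result.get("outputs") or []
    if outputs:
        return outputs[0]
    audios = result.get("audios") or []
    if audios and audios[0].get("url"):
        return audios[0]["url"]
    return None
```

Returning None for any status other than completed keeps the caller from reading a URL out of a task that is still running or has failed.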
4.3 Response example: generation failed
{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "failed",
"raw_status": "FAILURE",
"progress": "100%",
"audio_url": null,
"result": null,
"error": {
"message": "audio task failed"
}
}
5. Task status values
| status | Note |
|---|---|
| queued | Submitted, waiting to be processed |
| in_progress | Generating |
| processing | In progress |
| completed | Generation completed |
| failed | Generation failed |
We recommend polling the task status every 2 to 5 seconds.
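A polling loop over the status values above might look like the following sketch. `wait_for_task` is a hypothetical helper: `fetch_task` stands for any callable that returns the parsed JSON of the query endpoint, and the 300-second timeout is an assumed safety limit that does not come from this document.

```python
import time

# Statuses after which the task will not change any more.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(fetch_task, task_id, interval=3.0, timeout=300.0, sleep=time.sleep):
    """Poll fetch_task(task_id) until the task reaches a terminal status.

    Raises TimeoutError if the task is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} s")
        # queued, in_progress and processing all mean "keep waiting".
        sleep(interval)
```

Passing the fetch function in as an argument keeps the loop independent of any HTTP library and makes it easy to exercise with a fake.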
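6. Pause markers and sound tags
minimax-speech-2.8-hd supports pause markers and sound tags in the text.
| Tag | Note |
|---|---|
| <#0.5#> | Pause for 0.5 seconds |
| <#1.0#> | Pause for 1 second |
| (laughs) | Laughter |
| (sighs) | Sigh |
| (coughs) | Cough |
| (clears throat) | Throat clearing |
| (gasps) | Gasp |
| (sniffs) | Sniff |
| (groans) | Groan |
| (yawns) | Yawn |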
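The tags in the table above are plain text, so they can be built and inspected with ordinary string handling. This sketch assumes the number inside a pause marker is a duration in seconds; the helper names are hypothetical and the tag list is copied from the table.

```python
import re

# Matches pause markers such as <#0.5#> or <#1#>.
PAUSE_PATTERN = re.compile(r"<#(\d+(?:\.\d+)?)#>")
SOUND_TAGS = ("(laughs)", "(sighs)", "(coughs)", "(clears throat)",
              "(gasps)", "(sniffs)", "(groans)", "(yawns)")


def pause(seconds: float) -> str:
    """Format a pause marker for the given duration."""
    if seconds <= 0:
        raise ValueError("pause duration must be positive")
    return f"<#{seconds}#>"


def total_pause_seconds(text: str) -> float:
    """Sum the durations of all pause markers found in text."""
    return sum(float(value) for value in PAUSE_PATTERN.findall(text))


def sound_tags_in(text: str) -> list[str]:
    """List the sound tags from the table that occur in text."""
    return [tag for tag in SOUND_TAGS if tag in text]
```

For instance, `"Welcome back. " + pause(0.4) + " Let's begin."` yields a text with a single pause marker between the two sentences.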
Example:
<#0.8#>. (sighs)
7. Use Case Examples
7.1 Chinese short-video narration
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ". <#0.4#>. ",
"voice_id": "Wise_Woman",
"speed": 1.05,
"emotion": "happy",
"output_format": "url"}'
7.2 course explanation
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ". to change. ",
"voice_id": "Wise_Woman",
"speed": 0.95,
"emotion": "neutral",
"output_format": "url"}'
7.3 ad voiceover
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " new. <#0.3#>,! ",
"voice_id": "Wise_Woman",
"speed": 1.12,
"emotion": "happy",
"output_format": "url"}'
7.4 audiobook
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ".. ",
"voice_id": "Wise_Woman",
"speed": 0.88,
"emotion": "neutral",
"output_format": "url"}'
7.5 Details
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ". <#0.6#>. (sighs)",
"voice_id": "Wise_Woman",
"speed": 0.92,
"emotion": "sad",
"output_format": "url"}'
7.6 English podcast intro
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "Hey, welcome back to the show. <#0.4#> Today we are talking about how AI is changing creative work. (laughs)",
"voice_id": "Wise_Woman",
"speed": 1.0,
"emotion": "happy",
"language": "English",
"output_format": "url"}'
7.7 Multilingual speech
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ", service. Please hold on for a moment. service. ",
"voice_id": "Wise_Woman",
"speed": 1.0,
"language": "auto",
"emotion": "neutral",
"output_format": "url"}'
7.8 news broadcast
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": " new:, new. ",
"voice_id": "Wise_Woman",
"speed": 1.0,
"emotion": "neutral",
"output_format": "url"}'
7.9 digital human
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ", Yes AI. <#0.4#>. ",
"voice_id": "Wise_Woman",
"speed": 1.03,
"emotion": "happy",
"output_format": "url"}'
7.10 Details
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "., Yes. ",
"voice_id": "Wise_Woman",
"speed": 0.9,
"emotion": "happy",
"output_format": "url"}'
7.11 speech
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ". Operation. ",
"voice_id": "Wise_Woman",
"speed": 0.96,
"emotion": "neutral",
"output_format": "url"}'
7.12 Details
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ". <#1.0#>. <#1.0#>. ",
"voice_id": "Wise_Woman",
"speed": 0.82,
"emotion": "neutral",
"output_format": "url"}'
7.13 Submit using the input field
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"input": "This is a speech generation task submitted using the input field.",
"voice_id": "Wise_Woman",
"output_format": "url"}'
7.14 Custom pronunciation
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " use AI, new the platform. ",
"voice_id": "Wise_Woman",
"output_format": "url",
"pronunciation_dict": {"tone_list": ["AI /(A)(I)(ji2)(su4)"]}}'
7.15 Audio with a specified sample rate
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " Yesa video quality. ",
"voice_id": "Wise_Woman",
"sample_rate": 44100,
"output_format": "url"}'
7.16 IVR
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": "., query, service. ",
"voice_id": "Wise_Woman",
"speed": 0.98,
"emotion": "neutral",
"output_format": "url"}'
7.17 video
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " Yes. <#0.4#> completed, and generation. ",
"voice_id": "Wise_Woman",
"speed": 1.02,
"emotion": "happy",
"output_format": "url"}'
7.18 Details
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ", and. ",
"voice_id": "Wise_Woman",
"speed": 0.9,
"emotion": "neutral",
"output_format": "url"}'
8. JavaScript call example
const API_KEY = "YOUR_API_KEY";
const BASE_URL = "https://api.xxx.xx";
// Submit a speech generation task and return the parsed JSON response.
async function createAudioTask() {
  const response = await fetch(`${BASE_URL}/v1/audio/tasks`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "minimax-speech-2.8-hd",
      text: "Hello, this is speech generated by aijisu.",
      voice_id: "Wise_Woman",
      speed: 1,
      emotion: "neutral",
      output_format: "url"
    })
  });
  if (!response.ok) throw new Error(await response.text());
  return await response.json();
}
// Query a task by ID and return the parsed JSON response.
async function getAudioTask(taskId) {
  const response = await fetch(`${BASE_URL}/v1/audio/tasks/${taskId}`, {
    method: "GET",
    headers: { "Authorization": `Bearer ${API_KEY}` }
  });
  if (!response.ok) throw new Error(await response.text());
  return await response.json();
}
// Submit a task, then poll every 3 seconds until it completes or fails.
async function main() {
  const task = await createAudioTask();
  console.log("task_id:", task.task_id);
  while (true) {
    const result = await getAudioTask(task.task_id);
    console.log(result.status, result.progress);
    if (result.status === "completed") {
      console.log("audio_url:", result.audio_url);
      break;
    }
    if (result.status === "failed") {
      console.error("failed:", result.error);
      break;
    }
    await new Promise(resolve => setTimeout(resolve, 3000));
  }
}
main().catch(console.error);
9. Python call example
import time

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.xxx.xx"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
payload = {
    "model": "minimax-speech-02-hd",
    "text": "Hello, this is a speech generation task submitted from Python.",
    "voice_id": "Wise_Woman",
    "speed": 1,
    "emotion": "neutral",
    "output_format": "url",
}

# Submit the task and read the task ID from the response.
create_resp = requests.post(f"{BASE_URL}/v1/audio/tasks", headers=headers, json=payload)
create_resp.raise_for_status()
task = create_resp.json()
task_id = task["task_id"]

# Poll every 3 seconds until the task completes or fails.
while True:
    query_resp = requests.get(
        f"{BASE_URL}/v1/audio/tasks/{task_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    query_resp.raise_for_status()
    result = query_resp.json()
    print(result["status"], result.get("progress"))
    if result["status"] == "completed":
        print("audio_url:", result.get("audio_url"))
        break
    if result["status"] == "failed":
        print("failed:", result.get("error"))
        break
    time.sleep(3)
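10. Billing Notes
Speech generation is billed by the number of characters in the text. The following all count as characters:
- Chinese characters
- English letters
- punctuation
- spaces
- line breaks
- emoji
Example:
6 characters.
The actual charge is subject to the model pricing, the billing rules, and the account balance rules.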
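To estimate the billable length of a text before submitting it, a rough client-side count could look like the sketch below. It assumes one Unicode code point equals one billed character; how the platform actually counts emoji made of several code points is not stated here, so treat the result as an estimate. The function name is hypothetical.

```python
def count_billable_characters(text: str) -> int:
    """Count characters as Python does: one per Unicode code point.

    Letters, Chinese characters, punctuation, spaces and line breaks
    each contribute one to the total under this assumption.
    """
    return len(text)


print(count_billable_characters("Hi, AI"))  # 6
```

Spaces and punctuation are included in the total, which matches the list above.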
11. Model recommendations
11.1 Prefer minimax-speech-2.8-hd for:
- content that requires emotional expression
- content that requires pause markers or sound tags
- short-video narration
- ad voiceover
- digital human
- natural spoken voice
11.2 Prefer minimax-speech-02-hd for:
- long-form narration
- audiobook
- course explanation
- customer-service broadcast
- news broadcast
- multilingual speech
- scenarios that need stable output
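The recommendations above can be captured as a simple lookup. This is a sketch: the table `MODEL_BY_SCENARIO` and the function `recommend_model` are hypothetical, only scenarios named in this document are listed, and falling back to minimax-speech-02-hd for unknown scenarios is an assumption.

```python
MODEL_BY_SCENARIO = {
    "short-video narration": "minimax-speech-2.8-hd",
    "ad voiceover": "minimax-speech-2.8-hd",
    "digital human": "minimax-speech-2.8-hd",
    "long-form narration": "minimax-speech-02-hd",
    "audiobook": "minimax-speech-02-hd",
    "course explanation": "minimax-speech-02-hd",
    "customer-service broadcast": "minimax-speech-02-hd",
    "news broadcast": "minimax-speech-02-hd",
}


def recommend_model(scenario: str, default: str = "minimax-speech-02-hd") -> str:
    """Return the recommended model for a scenario, or `default` if it is unknown."""
    return MODEL_BY_SCENARIO.get(scenario.strip().lower(), default)
```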
12. Recommended parameters
12.1 short-video narration
{
"model": "minimax-speech-2.8-hd",
"text": ". <#0.4#>. ",
"voice_id": "Wise_Woman",
"speed": 1.05,
"emotion": "happy",
"output_format": "url"
}
12.2 audiobook
{
"model": "minimax-speech-02-hd",
"text": ". ",
"voice_id": "Wise_Woman",
"speed": 0.88,
"emotion": "neutral",
"output_format": "url"
}
12.3 customer-service broadcast
{
"model": "minimax-speech-02-hd",
"text": ", service., service. ",
"voice_id": "Wise_Woman",
"speed": 1,
"emotion": "neutral",
"output_format": "url"
}
12.4 Details
{
"model": "minimax-speech-2.8-hd",
"text": "? <#0.8#>. (sighs)",
"voice_id": "Wise_Woman",
"speed": 0.92,
"emotion": "sad",
"output_format": "url"
}
12.5 English
{
"model": "minimax-speech-2.8-hd",
"text": "Welcome back. <#0.4#> Today we are going to talk about how creators can use AI to work faster.",
"voice_id": "Wise_Woman",
"speed": 1,
"emotion": "happy",
"language": "English",
"output_format": "url"
}
13. FAQ
13.1 Does the submit endpoint return the audio directly?
No. Speech generation is an async task. The submit endpoint returns a task ID, and you need to call the query endpoint to get the audio URL.
13.2 What is the difference between text and input?
input is an OpenAI-style alternative to text. Prefer text.
13.3 Can one task generate multiple audio files?
We recommend generating one audio file per task. If you need several audio files, submit them as separate tasks.
13.4 How should long text be processed?
We recommend splitting long text across multiple tasks, which makes failures easier to contain and handle.
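One way to prepare long text for separate tasks is to cut it at sentence boundaries, as in the sketch below. The helper `split_text` and the 500-character default are assumptions for illustration; this document does not state a length limit.

```python
import re


def split_text(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Chunks end after sentence-ending punctuation where possible; a single
    sentence longer than max_chars is cut at the limit.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    # Split after ASCII or CJK sentence-ending punctuation, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?。！？])", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split a sentence that cannot fit in one chunk.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be submitted as its own task and the resulting audio files joined in order.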
13.5 How can I make the speech sound more natural?
Recommendations:
- Keep the punctuation in the text.
- do not use
- Adjust speed to suit the scenario.
- For digital-human and similar content, prefer minimax-speech-2.8-hd.
13.6 How do I get the audio URL?
Once the task is completed, the response returns audio_url, which you can download or process further.
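13.7 Why might a task fail? Common causes:
- a problem with the API key
- model, text, or input is missing
- both text and input are passed
- the voice ID is unavailable
- a parameter has the wrong format
- insufficient account balance, or no access to the model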
14. Complete example
### Submit the task
curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ", Yes speech. ",
"voice_id": "Wise_Woman",
"speed": 1,
"emotion": "neutral",
"output_format": "url"}'
Response:
{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "queued",
"raw_status": "SUBMITTED",
"progress": "0%",
"audio_url": null,
"result": null,
"error": null
}
### Query the task
curl -X GET "https://api.xxx.xx/v1/audio/tasks/task_xxxxxxxxxxxxx" \
-H "Authorization: Bearer YOUR_API_KEY"
### Get the audio URL
When status is completed, read:
{
"audio_url": "https://example.com/audio.mp3"
}
Use this URL to play or download the generated audio.