MiniMax Speech HD Generation

This document explains how to use the MiniMax Speech HD asynchronous speech generation models.

Supported models:

| Model name | Type | Recommended scenarios |
| --- | --- | --- |
| minimax-speech-2.8-hd | text-to-speech | short-video narration, ad voiceover, digital-human voice, emotional narration, natural spoken voice |
| minimax-speech-02-hd | text-to-speech | audiobook, course explanation, customer-service broadcast, news broadcast, long-form narration, multilingual speech |

The endpoints work in asynchronous task mode:

| Operation | Method | Path |
| --- | --- | --- |
| Submit speech task | POST | /v1/audio/tasks |
| Query speech task | GET | /v1/audio/tasks/{task_id} |

URL example:

https://api.xxx.xx

1. Model overview

1.1 minimax-speech-2.8-hd

minimax-speech-2.8-hd is the newer speech generation model, suited to content that requires emotional expression and natural spoken speech.

Suitable scenarios:

  • short-video narration

  • ad voiceover

  • digital human

  • podcast intro

Recommended text example:

<#0.5#>. (laughs)

1.2 minimax-speech-02-hd

minimax-speech-02-hd is the stable speech generation model, suited to long-text speech tasks that need stable output.

Suitable scenarios:

  • audiobook

  • course explanation

  • news broadcast

  • speech

  • long-form narration

  • multilingual speech generation

Recommended text example:

2. Authentication

All requests require an API Key.

Request headers:

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
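
The two headers above can be assembled once and reused for every request. The following is a minimal sketch, not an official client: the helper name `auth_headers` and the environment variable `AUDIO_API_KEY` are assumptions made for illustration.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the request headers described above (hypothetical helper)."""
    # AUDIO_API_KEY is an assumed variable name, not one defined by the platform.
    key = api_key or os.environ.get("AUDIO_API_KEY", "")
    if not key:
        raise ValueError("an API key is required")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source code; passing it explicitly is convenient for tests.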

3. Submit a speech generation task

URL:

POST https://api.xxx.xx/v1/audio/tasks

3.1 Request Parameters

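| Parameter | Type | Required | Note |
| --- | --- | --- | --- |
| model | string | Yes | Model name; supports minimax-speech-2.8-hd and minimax-speech-02-hd |
| text | string | Yes | Text to convert to speech |
| input | string | No | Text to convert to speech, OpenAI-style alternative to text |
| voice_id | string | No | Voice ID; a voice provided by the platform or another available voice ID |
| voice | string | No | Voice name, compatibility field |
| speed | number | No | Speech speed, range 0.5 to 2.0 |
| emotion | string | No | Emotion; one of happy, sad, angry, fearful, disgusted, surprised, neutral |
| language | string | No | Language, for example Chinese, English, Japanese, auto |
| output_format | string | No | Output format; url is recommended |
| response_format | string | No | Response format; url is recommended |
| sample_rate | number | No | Sample rate, for example 32000 or 44100 |
| pronunciation_dict | object | No | Pronunciation dictionary |
| timber_weights | array | No | Voice timbre weights |
| subtitle_enable | boolean | No | Whether to generate subtitles |
| metadata | object | No | |
| extra_body | object | No | Additional parameters |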

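Notes:

  • One of text or input is required.
  • Prefer text.
  • Do not pass both text and input in the same request; this is listed among the possible failure causes in 13.7.
  • This is an asynchronous task endpoint: after submitting a task, use the returned task_id to query the result.
  • Billing is based on character count. Chinese, English, punctuation, spaces, line breaks, and emoji all count.
  • The voice_id in the examples is a sample value; replace it with a voice ID that is available to you.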
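
The notes above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_payload` is a hypothetical helper, and rejecting a request that carries both text and input is a conservative choice made here, not a rule confirmed by the platform.

```python
from typing import Any, Dict, Optional

SUPPORTED_MODELS = {"minimax-speech-2.8-hd", "minimax-speech-02-hd"}


def build_payload(model: str,
                  text: Optional[str] = None,
                  input_text: Optional[str] = None,
                  **options: Any) -> Dict[str, Any]:
    """Build a JSON body for POST /v1/audio/tasks (hypothetical helper)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    # Exactly one of text / input must be present (assumption: both is an error).
    if bool(text) == bool(input_text):
        raise ValueError("provide exactly one of text or input")
    speed = options.get("speed")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload: Dict[str, Any] = {"model": model}
    if text:
        payload["text"] = text
    else:
        payload["input"] = input_text
    # Pass through optional fields such as voice_id, emotion, output_format.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

For example, `build_payload("minimax-speech-2.8-hd", text="Hello", voice_id="Wise_Woman", output_format="url")` returns a dictionary that can be sent as the request body.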

3.2 Request Examples

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "Hello, welcome to the aijisu speech generation service."}'

3.3 Successful Submission Response Example

{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "queued",
"raw_status": "SUBMITTED",
"progress": "0%",
"audio_url": null,
"result": null,
"error": null
}

4. Query the task result

URL:

GET https://api.xxx.xx/v1/audio/tasks/{task_id}

Request Examples:

curl -X GET "https://api.xxx.xx/v1/audio/tasks/task_xxxxxxxxxxxxx" \
-H "Authorization: Bearer YOUR_API_KEY"

4.1 In-Progress Response Example

{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "in_progress",
"raw_status": "IN_PROGRESS",
"progress": "45%",
"audio_url": null,
"result": null,
"error": null
}

4.2 Completed Response Example

{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "completed",
"raw_status": "SUCCESS",
"progress": "100%",
"audio_url": "https://example.com/audio.mp3",
"result": {
"audio_url": "https://example.com/audio.mp3",
"outputs": [
"https://example.com/audio.mp3"
],
"audios": [
{
"url": "https://example.com/audio.mp3"
}
]
},
"error": null
}
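
The completed response above carries the audio URL in several places (audio_url, result.audio_url, result.outputs, result.audios). The following is a minimal sketch of reading it defensively; the helper name `extract_audio_url` and the fallback order are assumptions, since the document does not say which field is authoritative.

```python
from typing import Any, Dict, Optional


def extract_audio_url(task: Dict[str, Any]) -> Optional[str]:
    """Return the first audio URL found in a task response, or None."""
    if task.get("audio_url"):
        return task["audio_url"]
    # result is null until the task completes, so fall back to an empty dict.
    result = task.get("result") or {}
    if result.get("audio_url"):
        return result["audio_url"]
    outputs = result.get("outputs") or []
    if outputs:
        return outputs[0]
    audios = result.get("audios") or []
    if audios and audios[0].get("url"):
        return audios[0]["url"]
    return None
```

A return value of None means the task has not produced audio yet or has failed.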

4.3 Failed Response Example

{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "failed",
"raw_status": "FAILURE",
"progress": "100%",
"audio_url": null,
"result": null,
"error": {
"message": "audio task failed"
}
}

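5. Task status values

| status | Note |
| --- | --- |
| queued | Submitted, waiting to be processed |
| in_progress | Generating |
| processing | In progress |
| completed | Generation completed |
| failed | Generation failed |

Polling is the recommended way to track a task: query the task status every 2 to 5 seconds.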
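
The polling recommendation can be sketched as a small loop that stops on a terminal status. This is an illustration only: `wait_for_task` is a hypothetical helper, `fetch_task` stands for any function that performs the GET query and returns the parsed JSON, and the 300-second timeout is an assumed default rather than a platform limit.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(fetch_task: Callable[[str], Dict[str, Any]],
                  task_id: str,
                  interval: float = 3.0,
                  timeout: float = 300.0,
                  sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll a task until it is completed or failed, or until the timeout."""
    waited = 0.0
    while True:
        task = fetch_task(task_id)
        # queued, in_progress, and processing all mean "keep waiting".
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} did not finish within {timeout:.0f}s")
        sleep(interval)
        waited += interval
```

Injecting `fetch_task` and `sleep` keeps the loop independent of any HTTP library and makes it easy to test without network access.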

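6. Pause markers and sound tags

minimax-speech-2.8-hd supports pause markers and sound tags in the text.

| Marker | Note |
| --- | --- |
| <#0.5#> | Pause for 0.5 seconds |
| <#1.0#> | Pause for 1 second |
| (laughs) | Laughter |
| (sighs) | Sigh |
| (coughs) | Cough |
| (clears throat) | Throat clearing |
| (gasps) | Gasp |
| (sniffs) | Sniff |
| (groans) | Groan |
| (yawns) | Yawn |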

Example:

<#0.8#>. (sighs)
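
Markers of this kind are easy to mistype, so it can help to generate and inspect them in code. The following is a minimal sketch: the helper names are hypothetical, and the regular expression only covers the `<#seconds#>` form shown in this document.

```python
import re
from typing import List

PAUSE_RE = re.compile(r"<#(\d+(?:\.\d+)?)#>")
SOUND_TAGS = ("(laughs)", "(sighs)", "(coughs)", "(clears throat)",
              "(gasps)", "(sniffs)", "(groans)", "(yawns)")


def pause(seconds: float) -> str:
    """Format a pause marker such as <#0.5#>."""
    if seconds <= 0:
        raise ValueError("pause length must be positive")
    return f"<#{seconds}#>"


def total_pause_seconds(text: str) -> float:
    """Sum the lengths of all pause markers found in the text."""
    return sum(float(value) for value in PAUSE_RE.findall(text))


def sound_tags_in(text: str) -> List[str]:
    """List the known sound tags that appear in the text."""
    return [tag for tag in SOUND_TAGS if tag in text]


line = f"Welcome back. {pause(0.5)} Let us begin. (laughs)"
print(line)
```

Because the markers are part of the text field, they presumably count toward the billed character total as well, although the document does not state this explicitly.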

7. Use Case Examples

7.1 Chinese short-video narration

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ". <#0.4#>. ",
"voice_id": "Wise_Woman",
"speed": 1.05,
"emotion": "happy",
"output_format": "url"}'

7.2 course explanation

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ". to change. ",
"voice_id": "Wise_Woman",
"speed": 0.95,
"emotion": "neutral",
"output_format": "url"}'

7.3 ad voiceover

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " new. <#0.3#>,! ",
"voice_id": "Wise_Woman",
"speed": 1.12,
"emotion": "happy",
"output_format": "url"}'

7.4 audiobook

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ".. ",
"voice_id": "Wise_Woman",
"speed": 0.88,
"emotion": "neutral",
"output_format": "url"}'

7.5 Details

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ". <#0.6#>. (sighs)",
"voice_id": "Wise_Woman",
"speed": 0.92,
"emotion": "sad",
"output_format": "url"}'

7.6 English podcast intro

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "Hey, welcome back to the show. <#0.4#> Today we are talking about how AI is changing creative work. (laughs)",
"voice_id": "Wise_Woman",
"speed": 1.0,
"emotion": "happy",
"language": "English",
"output_format": "url"}'

7.7 Multilingual speech

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ", service. Please hold on for a moment. service. ",
"voice_id": "Wise_Woman",
"speed": 1.0,
"language": "auto",
"emotion": "neutral",
"output_format": "url"}'

7.8 news broadcast

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": " new:, new. ",
"voice_id": "Wise_Woman",
"speed": 1.0,
"emotion": "neutral",
"output_format": "url"}'

7.9 digital human

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ", Yes AI. <#0.4#>. ",
"voice_id": "Wise_Woman",
"speed": 1.03,
"emotion": "happy",
"output_format": "url"}'

7.10 Details

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": "., Yes. ",
"voice_id": "Wise_Woman",
"speed": 0.9,
"emotion": "happy",
"output_format": "url"}'

7.11 speech

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ". Operation. ",
"voice_id": "Wise_Woman",
"speed": 0.96,
"emotion": "neutral",
"output_format": "url"}'

7.12 Details

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ". <#1.0#>. <#1.0#>. ",
"voice_id": "Wise_Woman",
"speed": 0.82,
"emotion": "neutral",
"output_format": "url"}'

7.13 Submit using the input field

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"input": "This is a speech generation task submitted using the input field.",
"voice_id": "Wise_Woman",
"output_format": "url"}'

7.14 Custom pronunciation dictionary

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " use AI, new the platform. ",
"voice_id": "Wise_Woman",
"output_format": "url",
"pronunciation_dict": {"tone_list": ["AI /(A)(I)(ji2)(su4)"]}}'

7.15 High sample rate audio

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " Yesa video quality. ",
"voice_id": "Wise_Woman",
"sample_rate": 44100,
"output_format": "url"}'

7.16 IVR

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": "., query, service. ",
"voice_id": "Wise_Woman",
"speed": 0.98,
"emotion": "neutral",
"output_format": "url"}'

7.17 video

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": " Yes. <#0.4#> completed, and generation. ",
"voice_id": "Wise_Woman",
"speed": 1.02,
"emotion": "happy",
"output_format": "url"}'

7.18 Details

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-02-hd",
"text": ", and. ",
"voice_id": "Wise_Woman",
"speed": 0.9,
"emotion": "neutral",
"output_format": "url"}'

8. JavaScript Call Example

const API_KEY = "YOUR_API_KEY";
const BASE_URL = "https://api.xxx.xx";

// Submit a speech generation task and return the parsed JSON response.
async function createAudioTask() {
  const response = await fetch(`${BASE_URL}/v1/audio/tasks`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "minimax-speech-2.8-hd",
      text: "Hello, this is speech generated with aijisu.",
      voice_id: "Wise_Woman",
      speed: 1,
      emotion: "neutral",
      output_format: "url"
    })
  });

  if (!response.ok) {
    throw new Error(await response.text());
  }

  return await response.json();
}

// Query a task by its task_id.
async function getAudioTask(taskId) {
  const response = await fetch(`${BASE_URL}/v1/audio/tasks/${taskId}`, {
    method: "GET",
    headers: { "Authorization": `Bearer ${API_KEY}` }
  });

  if (!response.ok) {
    throw new Error(await response.text());
  }

  return await response.json();
}

// Submit a task, then poll every 3 seconds until it completes or fails.
async function main() {
  const task = await createAudioTask();
  console.log("task_id:", task.task_id);

  while (true) {
    const result = await getAudioTask(task.task_id);
    console.log(result.status, result.progress);

    if (result.status === "completed") {
      console.log("audio_url:", result.audio_url);
      break;
    }

    if (result.status === "failed") {
      console.error("failed:", result.error);
      break;
    }

    await new Promise(resolve => setTimeout(resolve, 3000));
  }
}

main().catch(console.error);

9. Python Call Example

import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.xxx.xx"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

payload = {
    "model": "minimax-speech-02-hd",
    "text": "Hello, this is a speech generation task submitted from Python.",
    "voice_id": "Wise_Woman",
    "speed": 1,
    "emotion": "neutral",
    "output_format": "url",
}

# Submit the task.
create_resp = requests.post(
    f"{BASE_URL}/v1/audio/tasks",
    headers=headers,
    json=payload,
)
create_resp.raise_for_status()

task = create_resp.json()
task_id = task["task_id"]

# Poll every 3 seconds until the task completes or fails.
while True:
    query_resp = requests.get(
        f"{BASE_URL}/v1/audio/tasks/{task_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    query_resp.raise_for_status()

    result = query_resp.json()
    print(result["status"], result.get("progress"))

    if result["status"] == "completed":
        print("audio_url:", result.get("audio_url"))
        break

    if result["status"] == "failed":
        print("failed:", result.get("error"))
        break

    time.sleep(3)

10. Billing Notes

Speech generation is billed by character count. All of the following count as characters:

  • Chinese

  • English

  • punctuation

  • spaces

  • line breaks

  • emoji

Example:

6 characters.

Actual billing is subject to the model pricing, the billing rules, and the account balance rules.
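
Because punctuation, spaces, line breaks, and emoji all count, a quick local estimate of the billable length can be useful before submitting. The following is a rough sketch only: it assumes one billable character per Unicode code point, which this document does not confirm (an emoji built from several code points may be counted differently by the platform), and `count_billable_characters` is a hypothetical helper.

```python
def count_billable_characters(text: str) -> int:
    """Approximate the billable length as the number of Unicode code points."""
    # Assumption: every code point counts once, including punctuation,
    # spaces, line breaks, and emoji.
    return len(text)


sample = "Hi, AI!\n"
# Six visible characters plus one space and one line break.
print(count_billable_characters(sample))  # 8
```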

11.1 Prefer minimax-speech-2.8-hd for:

  • content that requires emotional expression
  • content that requires a natural spoken voice
  • short-video narration
  • ad voiceover
  • digital human
  • speech

11.2 Prefer minimax-speech-02-hd for:

  • long-form narration
  • audiobook
  • course explanation
  • customer-service broadcast
  • news broadcast
  • multilingual speech
  • scenarios that need stable output
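
The two lists above amount to a lookup from scenario to model. The following is a minimal sketch of that lookup; the helper name `choose_model` and the exact scenario strings are assumptions taken from the scenario names used in this document, not identifiers accepted by the API.

```python
EXPRESSIVE_SCENARIOS = {
    "short-video narration", "ad voiceover", "digital human", "podcast intro",
}
STABLE_SCENARIOS = {
    "long-form narration", "audiobook", "course explanation",
    "customer-service broadcast", "news broadcast", "multilingual speech",
}


def choose_model(scenario: str) -> str:
    """Map a scenario name from this document to the suggested model."""
    key = scenario.strip().lower()
    if key in EXPRESSIVE_SCENARIOS:
        return "minimax-speech-2.8-hd"
    if key in STABLE_SCENARIOS:
        return "minimax-speech-02-hd"
    raise ValueError(f"no recommendation for scenario: {scenario!r}")


print(choose_model("audiobook"))
```

Raising an error for unknown scenarios, rather than silently picking a default, keeps the choice explicit.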

12.1 short-video narration

{
"model": "minimax-speech-2.8-hd",
"text": ". <#0.4#>. ",
"voice_id": "Wise_Woman",
"speed": 1.05,
"emotion": "happy",
"output_format": "url"
}

12.2 audiobook

{
"model": "minimax-speech-02-hd",
"text": ". ",
"voice_id": "Wise_Woman",
"speed": 0.88,
"emotion": "neutral",
"output_format": "url"
}

12.3 customer-service broadcast

{
"model": "minimax-speech-02-hd",
"text": ", service., service. ",
"voice_id": "Wise_Woman",
"speed": 1,
"emotion": "neutral",
"output_format": "url"
}

12.4 Details

{
"model": "minimax-speech-2.8-hd",
"text": "? <#0.8#>. (sighs)",
"voice_id": "Wise_Woman",
"speed": 0.92,
"emotion": "sad",
"output_format": "url"
}

12.5 English

{
"model": "minimax-speech-2.8-hd",
"text": "Welcome back. <#0.4#> Today we are going to talk about how creators can use AI to work faster.",
"voice_id": "Wise_Woman",
"speed": 1,
"emotion": "happy",
"language": "English",
"output_format": "url"
}

13. FAQ

13.1 Does the submit endpoint return audio directly?

No. Speech generation is an asynchronous task. The submit endpoint returns a task ID, and you then call the query endpoint to get the audio URL.

13.2 What is the difference between text and input?

input is an OpenAI-style alternative to text. Prefer text.

13.3 generation audio?

recommended generation audio. If required audio, recommended task submit.

13.4 text processing?

recommended task. controlFailed, and.

13.5 speech?

recommended:

  • keep punctuation in the text

  • do not use

  • scenario speed -, digital humanprefer using minimax-speech-2.8-hd

13.6 How do I get the audio URL?

Once the task is completed, the response returns audio_url, which you can download or process further.

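13.7 Why did the task fail? Possible causes:

  • invalid API Key
  • incorrect model name
  • missing text or input
  • passing both text and input
  • voice ID unavailable
  • invalid parameter format
  • insufficient account balance, or no permission to use the model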

14. Example

Step 1: Submit the task

curl -X POST "https://api.xxx.xx/v1/audio/tasks" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "minimax-speech-2.8-hd",
"text": ", Yes speech. ",
"voice_id": "Wise_Woman",
"speed": 1,
"emotion": "neutral",
"output_format": "url"}'

Response:

{
"id": "task_xxxxxxxxxxxxx",
"task_id": "task_xxxxxxxxxxxxx",
"object": "audio.generation.job",
"status": "queued",
"raw_status": "SUBMITTED",
"progress": "0%",
"audio_url": null,
"result": null,
"error": null
}

Step 2: Query the task

curl -X GET "https://api.xxx.xx/v1/audio/tasks/task_xxxxxxxxxxxxx" \
-H "Authorization: Bearer YOUR_API_KEY"

Step 3: Get the audio URL

Once status is completed, read:

{
"audio_url": "https://example.com/audio.mp3"
}

Then play or download the generated audio.