Voice Guide

Guide to using a saved voice or a Voice Library voice in your TTS API requests.

Hume’s text-to-speech (TTS) API lets you specify which voice to use when synthesizing speech. You can use a custom voice that you have saved or select one from Hume’s Voice Library. If you omit the voice field, the model will generate one dynamically based on the input text and optional description.

This guide explains how to specify a voice in both standard and streaming TTS requests.

To learn how to create or manage voices, see the Voice Design Guide and Voice Management Guide.

Voice reference options

You can specify a voice in your request using either its id or name. Each voice belongs to a provider, which indicates the source of the voice and who can access it.

The provider field accepts the following values:

Provider | Description
CUSTOM_VOICE | Select from designed or cloned voices you’ve saved to your account. These voices are private.
HUME_AI | Select from Hume’s shared Voice Library of predesigned voices. These voices are public.

If you omit the provider field, it defaults to CUSTOM_VOICE. To use a voice from the Voice Library, you must explicitly set the provider to HUME_AI.

Specify a saved voice by name
{
  "voice": {
    "name": "My Custom Voice",
    // "provider": "CUSTOM_VOICE" (optional)
  }
}
Specify a saved voice by ID
{
  "voice": {
    "id": "795c949a-1510-4a80-9646-7d0863b023ab",
    // "provider": "CUSTOM_VOICE" (optional)
  }
}

You can find voice IDs and names using the List Voices endpoint or in the My Voices section of the Platform UI.
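
The reference rules above can be captured in a small client-side helper. The following is a minimal sketch, not part of any Hume SDK: the function name `voice_ref` and its validation rules are assumptions made for illustration. In particular, requiring exactly one of `id` or `name` is a choice made by this sketch.

```python
from typing import Optional

# The two provider values listed in the table above.
VALID_PROVIDERS = {"CUSTOM_VOICE", "HUME_AI"}


def voice_ref(*, id: Optional[str] = None, name: Optional[str] = None,
              provider: Optional[str] = None) -> dict:
    """Build the `voice` object for a TTS request (hypothetical helper)."""
    # This sketch requires exactly one of id or name.
    if (id is None) == (name is None):
        raise ValueError("pass exactly one of id or name")
    if provider is not None and provider not in VALID_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")

    ref = {"id": id} if id is not None else {"name": name}
    # When provider is left out, the API falls back to CUSTOM_VOICE.
    if provider is not None:
        ref["provider"] = provider
    return ref


# A saved voice by name, relying on the CUSTOM_VOICE default.
saved = voice_ref(name="My Custom Voice")

# A Voice Library voice, which needs the provider set explicitly.
library = voice_ref(id="9e068547-5ba4-4c8e-8e03-69282a008f04", provider="HUME_AI")
```

Omitting `provider` from the returned object, rather than always writing `CUSTOM_VOICE`, keeps the request body as small as the examples above.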

Specify a voice in your request

To specify a voice for speech synthesis, include the voice field in the first utterance of your request. That voice will be used for all subsequent utterances unless you override it in a later utterance.
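
As a rough illustration of that rule, the sketch below builds a request body with three utterances: the first sets a voice, the second omits the field and so reuses it, and the third overrides it. The utterance texts are placeholders, and the saved voice name is taken from the earlier example.

```python
import json

library_voice = {
    "id": "9e068547-5ba4-4c8e-8e03-69282a008f04",
    "provider": "HUME_AI",
}
saved_voice = {"name": "My Custom Voice"}  # provider defaults to CUSTOM_VOICE

payload = {
    "utterances": [
        # First utterance: sets the voice for the request.
        {"text": "Welcome to the show.", "voice": library_voice},
        # No voice field: the voice from the first utterance is used.
        {"text": "Here is today's first story."},
        # A later utterance can override the voice.
        {"text": "Thanks for having me.", "voice": saved_voice},
    ]
}

# Serialize to the JSON body that would be sent to the TTS endpoint.
body = json.dumps(payload, indent=2)
print(body)
```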

Both standard and streaming TTS endpoints support voice selection. The request body format is identical across both.

curl https://api.hume.ai/v0/tts \
  -H "X-Hume-Api-Key: $HUME_API_KEY" \
  --json '{
    "utterances": [
      {
        "text": "Beauty is no quality in things themselves: It exists merely in the mind which contemplates them.",
        "voice": {
          "id": "9e068547-5ba4-4c8e-8e03-69282a008f04",
          "provider": "HUME_AI"
        }
      }
    ]
  }'
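
For readers working in Python, the same request can be sketched with the standard library alone. This is an illustrative equivalent of the curl command, not an official client: the helper name `build_tts_request` is hypothetical, and the `Content-Type` and `Accept` headers mirror what curl's `--json` flag sets. The request is built but not sent, since sending requires a valid API key and network access.

```python
import json
import os
import urllib.request

TTS_URL = "https://api.hume.ai/v0/tts"


def build_tts_request(utterances: list, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the TTS endpoint."""
    body = json.dumps({"utterances": utterances}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers={
            "X-Hume-Api-Key": api_key,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_tts_request(
    [
        {
            "text": "Beauty is no quality in things themselves: "
                    "It exists merely in the mind which contemplates them.",
            "voice": {
                "id": "9e068547-5ba4-4c8e-8e03-69282a008f04",
                "provider": "HUME_AI",
            },
        }
    ],
    api_key=os.environ.get("HUME_API_KEY", ""),
)

# To send it: urllib.request.urlopen(request)
```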

Instant mode is enabled by default for streaming endpoints. This mode requires a voice to be specified. If you omit the voice, the request will return an error.
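
One way to catch this before the request leaves your application is a small pre-flight check. The sketch below is an assumption-laden illustration: the function name is hypothetical, `instant_mode` is only a local argument of this sketch and not a documented request field, and the check looks for the voice on the first utterance, following the convention described earlier in this guide.

```python
def require_voice_for_instant_mode(payload: dict, instant_mode: bool = True) -> None:
    """Raise ValueError if an instant-mode request has no voice (hypothetical check)."""
    if not instant_mode:
        return  # Without instant mode, the voice can be generated dynamically.
    utterances = payload.get("utterances") or []
    # The voice is expected on the first utterance of the request.
    if not utterances or not utterances[0].get("voice"):
        raise ValueError("instant mode requires a voice on the first utterance")


ok = {"utterances": [{"text": "Hello.", "voice": {"name": "My Custom Voice"}}]}
missing = {"utterances": [{"text": "Hello."}]}

require_voice_for_instant_mode(ok)  # passes silently

try:
    require_voice_for_instant_mode(missing)
except ValueError as err:
    print(f"rejected before sending: {err}")
```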

Resources