Voice Guide
Hume’s text-to-speech (TTS) API lets you specify which voice to use when synthesizing speech. You can use a custom voice that you have saved or select one from Hume’s Voice Library.
This guide explains how to specify a voice across all of Hume’s TTS endpoints.
To learn how to create or manage voices, see the Voice Design Guide, Voice Cloning Guide, and Voice Management Guide.
Voice reference options
You can specify a voice by name or id. If you use name, include a provider (defaults to CUSTOM_VOICE). To reference a voice from Hume’s Voice Library by name, set the provider to HUME_AI.
Get voice IDs and names from /v0/tts/voices or from the Platform’s Voice Library page.
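As a minimal sketch of the two reference styles, the Python helpers below build a voice reference either by id or by name. The helper names, the placeholder voice names, and the exact dictionary shape are assumptions made for illustration; only the id, name, and provider fields and the CUSTOM_VOICE and HUME_AI values come from this guide.

```python
# Hypothetical helpers (not part of any Hume SDK) for building a voice reference.

PROVIDERS = ("CUSTOM_VOICE", "HUME_AI")


def voice_by_id(voice_id: str) -> dict:
    """Reference a voice by its ID."""
    if not voice_id:
        raise ValueError("voice_id must be a non-empty string")
    return {"id": voice_id}


def voice_by_name(name: str, provider: str = "CUSTOM_VOICE") -> dict:
    """Reference a voice by name, stating the provider explicitly."""
    if not name:
        raise ValueError("name must be a non-empty string")
    if provider not in PROVIDERS:
        raise ValueError(f"provider must be one of {PROVIDERS}")
    return {"name": name, "provider": provider}


# A saved custom voice (provider falls back to CUSTOM_VOICE).
custom = voice_by_name("My Saved Voice")

# A Voice Library voice referenced by name needs provider HUME_AI.
library = voice_by_name("Example Library Voice", provider="HUME_AI")
```

Writing the provider out even when it matches the default keeps the request unambiguous when a custom voice and a Voice Library voice share a name.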
Specify a voice in your request
To set a voice, include the voice field in the first utterance of your request. That voice is used for all following utterances unless you override it later.
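The rule above can be sketched as a small request builder. This is an illustration under assumptions, not the official client: the function name is hypothetical, and the utterances and text keys are assumed names for the request body, while the voice field on the first utterance follows this guide.

```python
import json


def build_tts_request(texts, voice, overrides=None):
    """Build a request body with the voice set on the first utterance.

    texts: list of strings, one per utterance.
    voice: voice reference dict placed on the first utterance.
    overrides: optional {utterance_index: voice_dict} for later utterances.
    """
    if not texts:
        raise ValueError("at least one utterance is required")
    overrides = overrides or {}
    utterances = []
    for index, text in enumerate(texts):
        utterance = {"text": text}  # "text" is an assumed key name
        if index == 0:
            utterance["voice"] = voice
        elif index in overrides:
            utterance["voice"] = overrides[index]
        # Utterances without a "voice" key rely on the earlier voice.
        utterances.append(utterance)
    return {"utterances": utterances}  # "utterances" is an assumed key name


body = build_tts_request(
    ["Welcome back.", "Here is today's summary.", "And now, a different speaker."],
    voice={"name": "My Saved Voice", "provider": "CUSTOM_VOICE"},
    overrides={2: {"id": "example-voice-id"}},  # placeholder ID
)
print(json.dumps(body, indent=2))
```

Here the second utterance carries no voice field and so uses the voice from the first, while the third overrides it with a different voice.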
Voice specification works the same way across streaming and non-streaming endpoints.
Instant mode is enabled by default for streaming endpoints. This mode requires a voice to be specified. If you omit the voice, the request will return an error.
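Because a missing voice only surfaces as an error after the request is sent, a client-side check can catch it earlier. The sketch below is one possible guard, assuming the same request body shape as a list of utterances under an utterances key; the function name and its instant_mode flag are hypothetical and local to this example, not API parameters.

```python
def check_voice_for_streaming(body: dict, instant_mode: bool = True) -> None:
    """Raise ValueError if instant mode is on and the first utterance has no voice.

    `instant_mode` is a local flag for this sketch; it mirrors the default
    described in this guide (enabled for streaming endpoints).
    """
    if not instant_mode:
        return
    utterances = body.get("utterances") or []  # assumed key name
    first_voice = utterances[0].get("voice") if utterances else None
    if not first_voice:
        raise ValueError("Instant mode requires a voice on the first utterance")


# Passes: the first utterance names a voice.
check_voice_for_streaming(
    {"utterances": [{"text": "Hello.", "voice": {"id": "example-voice-id"}}]}
)

# Fails fast: no voice anywhere in the request.
try:
    check_voice_for_streaming({"utterances": [{"text": "Hello."}]})
    missing_voice_detected = False
except ValueError:
    missing_voice_detected = True
```

Checking only the first utterance is enough here, since its voice applies to the utterances that follow.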
Resources
Learn how to design and create custom voices.
Create a voice clone from a live recording or an audio file.
Control speech delivery using expressive performance cues.
Generate speech that leverages previous generations as context.