Voice | Hume API

Voice is foundational to any system that generates speech. It sets the tone, style, and pacing for how content is delivered. Whether it’s the friendly demeanor of a virtual assistant, the immersive narration of an audiobook, or the distinct personality of a character, the chosen voice shapes the listener’s experience.

Octave is Hume’s speech-language model for generating expressive speech with LLM intelligence. Unlike conventional TTS systems that rely on acoustic templates or phoneme-based pipelines, Octave understands what the text means and how it should be spoken.

Voices, whether selected from the Voice Library or created using prompts, are used in Hume’s two voice products: Empathic Voice Interface (EVI) and Text-to-Speech (TTS). If you’re getting started with either, selecting or designing a voice is often your first step.

Empathic Voice Interface (EVI)

Real-time, emotionally intelligent voice AI for conversational interfaces.

Text-to-Speech (TTS)

Synthesize expressive speech from text using Octave.

Try our free voice design demo to hear how Octave generates expressive speech from natural language descriptions. No signup or code is required.

Voice design

Octave deeply models language and speech patterns to generate new voices from natural language descriptions. These prompts can specify tone, emotion, accent, and other stylistic traits with a high degree of control.
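To make the idea of a voice prompt concrete, here is a minimal sketch of how a description might be attached to the text it should voice. It only assembles and prints a JSON body; it does not call the API. The field names `utterances`, `text`, and `description` are assumptions about the TTS request shape and should be checked against the TTS API reference, and the helper `build_voice_design_request` is hypothetical.

```python
import json


def build_voice_design_request(text: str, description: str) -> dict:
    """Assemble a request body that pairs text with a voice description.

    The field names used here ("utterances", "text", "description") are
    assumptions for illustration, not a confirmed schema.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    if not description.strip():
        raise ValueError("description must not be empty")
    return {"utterances": [{"text": text, "description": description}]}


# The description covers tone, pacing, and accent, the kinds of traits
# a voice prompt can specify.
body = build_voice_design_request(
    text="Welcome back. Shall we pick up where we left off?",
    description="A warm, unhurried narrator with a soft Scottish accent",
)
print(json.dumps(body, indent=2))
```

Keeping payload construction in a small function makes it easy to try several descriptions against the same text and compare the results by ear.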

The Voice Library offers over 100 voices crafted by Hume with Octave, each reflecting a unique style, personality, or accent. These voices can be used directly or serve as inspiration for creating your own.

Voice Design Guide

See the Voice Design Guide for how to design and create a custom voice.

Voice Library

Visit the Voice Library to explore Hume’s predesigned voices.

Voice cloning

While Octave supports voice design from natural language descriptions, it can also create voices from audio samples, reflecting the speaker’s tone, accent, cadence, and vocal identity.

Voice Cloning Guide

Create a voice clone from a live recording or an audio file.

Voice management

Manage your custom voices using the Platform UI or programmatically through the API. The guide below covers both workflows.

Voice Management Guide

View, rename, and delete custom voices via the Platform or API.

Voice integration

Voices you design or select from the Voice Library can be used across all Hume products that support speech synthesis. The guides below explain how to configure a voice for each API.

Empathic Voice Interface (EVI)

Configure EVI to use a specified voice.

Text-to-Speech (TTS)

Specify a voice in your TTS requests.
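As a rough illustration of specifying a voice in a TTS request, the sketch below builds a request body that references a saved voice by name. It does not send anything over the network. The `voice` object with `name` and `provider` fields, and the provider values `HUME_AI` (Voice Library) and `CUSTOM_VOICE` (voices you created), are assumptions about the request shape; confirm them in the TTS guide. The voice names shown are placeholders, and `build_tts_request` is a hypothetical helper.

```python
import json

# Assumed provider values: Voice Library voices vs. voices saved to your account.
ALLOWED_PROVIDERS = {"HUME_AI", "CUSTOM_VOICE"}


def build_tts_request(text: str, voice_name: str, provider: str = "HUME_AI") -> dict:
    """Assemble a request body that selects an existing voice by name.

    The "voice" object and its "name"/"provider" fields are assumptions
    for illustration, not a confirmed schema.
    """
    if provider not in ALLOWED_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    if not voice_name.strip():
        raise ValueError("voice_name must not be empty")
    return {
        "utterances": [
            {"text": text, "voice": {"name": voice_name, "provider": provider}}
        ]
    }


# Placeholder names; substitute a real Voice Library or custom voice name.
library_request = build_tts_request("Thanks for calling.", "Example Library Voice")
custom_request = build_tts_request(
    "Thanks for calling.", "my-custom-voice", provider="CUSTOM_VOICE"
)
print(json.dumps(custom_request, indent=2))
```

Validating the provider before building the body catches a mistyped value locally instead of in a failed request.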