
Welcome to Hume AI

Octave 2 (preview) and EVI 4-mini are live! Expanded language support and lower latency for faster, more natural responses. Learn more.

Hume is a research lab and technology company with a mission to ensure that artificial intelligence is built to serve human goals and emotional well-being.

Hume develops two categories of models: speech-language models that interpret and generate expressive speech, and expression measurement models that analyze vocal, facial, and verbal expression.

These models are available through three APIs: the Empathic Voice Interface (EVI) for real-time voice interaction, Text-to-Speech (TTS) for expressive speech synthesis, and Expression Measurement for analyzing expression in media and text.

Speech-to-Speech (EVI)

Hume’s Empathic Voice Interface (EVI) is an advanced, real-time, emotionally intelligent voice AI. EVI measures users’ nuanced vocal modulations and responds to them using a speech-language model, which guides language and speech generation. Trained on millions of human interactions, our speech-language model unites language modeling and text-to-speech with better EQ, prosody, end-of-turn detection, interruptibility, and alignment.

  • Interviewing & Coaching: Simulate lifelike interviews or leadership coaching sessions with dynamic tone adjustment.
  • Digital Companions: Build emotionally aware companions for seniors, kids, or mental wellness support.
  • Digital Assistants: Respond with empathy and modulate tone to reduce user frustration or improve engagement.

Playground

Visit our Platform’s no-code interface for testing and configuring EVI.

API Reference

See our API reference for EVI WebSocket and REST endpoints.
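
As a rough sketch of what opening an EVI WebSocket session might involve, the snippet below only builds a connection URL and does not open a socket. The host, the path, and the query parameter names (`api_key`, `config_id`) are assumptions made for illustration, as is the helper `build_evi_chat_url`; check them against the API reference before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed EVI chat WebSocket endpoint; confirm against the API reference.
EVI_CHAT_URL = "wss://api.hume.ai/v0/evi/chat"


def build_evi_chat_url(api_key: str, config_id: Optional[str] = None) -> str:
    """Return a WebSocket URL for an EVI chat session (hypothetical helper)."""
    # Parameter names are assumptions, not confirmed by this page.
    params = {"api_key": api_key}
    if config_id is not None:
        params["config_id"] = config_id
    return f"{EVI_CHAT_URL}?{urlencode(params)}"


# Placeholder credentials; substitute real values from your account.
url = build_evi_chat_url("YOUR_API_KEY", config_id="YOUR_CONFIG_ID")
print(url)
```

Any WebSocket client could then connect to the resulting URL. Passing the key as a query parameter is only one possible authentication strategy.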

Text-to-Speech (TTS)

Octave TTS is the first text-to-speech system built on LLM intelligence. Unlike conventional TTS that merely “reads” words, Octave is a “speech-language model” that understands what words mean in context, unlocking a new level of expressiveness and nuance.

  • Creative Tools: Narration for video, podcasting, and audiobooks.
  • Education/Coaching: Deliver lessons with engaging, emotionally varied voice.
  • Digital Avatars: Give realistic voice to AI-powered characters in apps, games, or virtual experiences.

Playground

Check out our Platform’s no-code interface for testing Octave’s capabilities.

API Reference

See our API reference for TTS streaming and non-streaming endpoints.
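
To make the non-streaming case concrete, here is a minimal sketch that prepares (but does not send) a TTS request using only the Python standard library. The endpoint URL, the `X-Hume-Api-Key` header, and the `utterances`, `text`, and `description` field names are assumptions to verify against the API reference; `build_tts_request` is a hypothetical helper, not part of any Hume SDK.

```python
import json
import urllib.request
from typing import Optional

# Assumed non-streaming TTS endpoint; confirm against the API reference.
TTS_URL = "https://api.hume.ai/v0/tts"


def build_tts_request(
    api_key: str, text: str, description: Optional[str] = None
) -> urllib.request.Request:
    """Prepare a POST request asking for speech for one utterance."""
    utterance = {"text": text}
    if description:
        # Assumed field: a natural language hint about how to deliver the line.
        utterance["description"] = description
    body = json.dumps({"utterances": [utterance]}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers={
            "X-Hume-Api-Key": api_key,  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_tts_request(
    "YOUR_API_KEY",
    "Welcome back. Let's pick up where we left off.",
    description="warm, unhurried narrator",
)
# Sending is left out on purpose; urllib.request.urlopen(request) would do it.
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect before any API credit is spent.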

Expression Measurement

Hume’s state-of-the-art expression measurement models for the voice, face, and language are built on more than 10 years of research and advances in semantic space theory pioneered by Alan Cowen. These models capture hundreds of dimensions of human expression in audio, video, and images.

  • Health & Wellness: Monitor patient tone and emotion during therapy or check-ins.
  • Call Center Analytics: Detect caller frustration or distress for triage and escalation.
  • UX/CX Research: Analyze user interviews and testing sessions for sentiment trends.

Playground

Explore our Platform’s no-code interface for testing Hume’s expression measurement models.

API Reference

See our API reference for streaming and batch expression measurement endpoints.
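
For the batch case, a job submission might look roughly like the payload built below. This is a sketch under stated assumptions: the endpoint URL, the `models` and `urls` field names, and the model names `prosody` and `language` are not confirmed by this page, and `build_batch_job_payload` is a hypothetical helper. Nothing is sent over the network.

```python
import json
from typing import Dict, Iterable, Sequence

# Assumed batch jobs endpoint; confirm against the API reference.
BATCH_JOBS_URL = "https://api.hume.ai/v0/batch/jobs"


def build_batch_job_payload(
    media_urls: Iterable[str],
    models: Sequence[str] = ("prosody", "language"),  # assumed model names
) -> Dict[str, object]:
    """Build a JSON-serializable body for a batch expression measurement job."""
    urls = list(media_urls)
    if not urls:
        raise ValueError("at least one media URL is required")
    # Each model gets an empty config, meaning "use default settings" (assumption).
    return {"models": {name: {} for name in models}, "urls": urls}


payload = build_batch_job_payload(["https://example.com/support-call.wav"])
print(json.dumps(payload, indent=2))
```

The body would be sent as JSON in a POST to the batch endpoint, with results retrieved once the job completes.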

Voice

Voice defines how speech is delivered, shaping tone, pacing, accent, and personality. It plays a central role in how listeners perceive meaning and emotion.

All voices in Hume’s platform are powered by Octave, a speech-language model built on LLM intelligence. Octave enables expressive, context-aware speech generation from both text and natural language descriptions.

Voices can be used across both EVI and TTS to tailor how content is spoken.

Voice Library

Explore over 100 expressive voices designed by Hume and available for immediate use.

Voice Design

Learn how to create custom voices using descriptive prompts and Octave’s expressive generation.

Voice Cloning

Clone a voice from a recorded or uploaded speech sample with user consent.

SDKs

Jumpstart your development with SDKs built for Hume APIs. They handle authentication, requests, and workflows to make integration straightforward. With support for React, TypeScript, Python, Swift, and .NET, our SDKs provide the tools you need to build efficiently across different environments.

React SDK

Integrate Hume’s Empathic Voice Interface into React apps with tools for audio recording, playback, and API interaction.

TypeScript SDK

Integrate Hume APIs directly into your Node.js or frontend web applications.

Python SDK

Access Hume’s APIs in Python with async/sync clients, error handling, and streaming tools.

Swift SDK

Build iOS and macOS apps with EVI voice chat, microphone capture, real-time playback, and TTS file streaming.

.NET SDK

Use Hume’s APIs in .NET with typed TTS clients, automatic retries, pagination, and configurable timeouts.

Example Code

Explore step-by-step guides and sample projects for integrating Hume APIs. Our GitHub repositories include ready-to-use code and open-source SDKs to support your development process in various environments.

hume-api-examples

Browse sample code and projects designed to help you integrate Hume APIs.

GitHub Organization

Explore all of Hume’s open-source SDKs, examples, and public-facing code.

Get Support

Need help? Our team is here to support you.

Discord

Join our Discord community for direct support from the Hume team.