Expression Measurement API

Hume's state-of-the-art expression measurement models for the voice, face, and language.

Intro

Hume’s state-of-the-art expression measurement models for the voice, face, and language are built on 10+ years of research and advances in computational approaches to emotion science (semantic space theory) pioneered by our team. Our expression measurement models capture hundreds of dimensions of human expression in audio, video, and images.

Measurements

  • Facial Expression, including subtle facial movements often seen as expressing love or admiration, awe, disappointment, or cringes of empathic pain, along 48 distinct dimensions of emotional meaning. The Facial Expression model can also optionally output FACS 2.0 measurements, our model of facial movement that includes traditional Action Units (AUs such as “Inner brow raise” and “Nose crinkle”) as well as facial descriptions (“Smile”, “Wink”, “Hand over mouth”, “Hand over eyes”).
  • Speech Prosody, or the non-linguistic tone, rhythm, and timbre of speech, spanning 48 distinct dimensions of emotional meaning.
  • Vocal Burst, including laughs, sighs, huhs, hmms, cries and shrieks (to name a few), along 48 distinct dimensions of emotional meaning.
  • Emotional Language, or the emotional tone of transcribed text, along 53 distinct dimensions of emotional meaning.

These expressive behaviors are complex and multifaceted, which is why each model measures them along dozens of distinct dimensions rather than collapsing them into a handful of discrete categories.
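Each of these models can be enabled independently in a single request. As a minimal sketch, here is how a batch job with all four models might be started; the endpoint path, header, and field names are assumptions drawn from Hume's API reference at the time of writing, so verify them against the current docs.

```python
# A minimal sketch, not a definitive implementation: the endpoint, header,
# and field names are assumptions taken from Hume's API reference.
import requests  # pip install requests

API_KEY = "YOUR_API_KEY"  # placeholder credential

response = requests.post(
    "https://api.hume.ai/v0/batch/jobs",
    headers={"X-Hume-Api-Key": API_KEY},
    json={
        # Enable any subset of the four models; an empty object requests
        # that model's default configuration.
        "models": {
            "face": {},      # Facial Expression
            "prosody": {},   # Speech Prosody
            "burst": {},     # Vocal Burst
            "language": {},  # Emotional Language
        },
        "urls": ["https://example.com/interview.mp4"],  # hypothetical media URL
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["job_id"])  # poll this job ID for predictions once complete
```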

To learn more about how to use our models, visit our API reference.

Model training

The models were trained on human intensity ratings of large-scale, experimentally controlled emotional expression data gathered using the methods described in these papers: Deep learning reveals what vocal bursts express in different cultures and Deep learning reveals what facial expressions mean to people in different cultures.

While our models measure nuanced expressions that people most typically describe with emotion labels, it’s important to remember that they are not a direct readout of what someone is experiencing. The facial and vocal models will sometimes report different emotional meanings for the same moment, and that is expected: emotional experience is subjective, and its expression is multimodal and context-dependent.

Try out the models

Learn how you can use the Expression Measurement API through both REST and WebSockets.
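As a minimal WebSocket sketch, the following streams raw text to the Emotional Language model. The wss://api.hume.ai/v0/stream/models endpoint and the models/raw_text/data message fields are assumptions taken from Hume's API reference at the time of writing; verify them before relying on this.

```python
# A minimal streaming sketch under assumed endpoint and message shapes.
import asyncio
import json

import websockets  # pip install websockets


async def measure_text(text: str, api_key: str) -> dict:
    uri = "wss://api.hume.ai/v0/stream/models"
    # websockets >= 14 names this kwarg `additional_headers`;
    # older releases call it `extra_headers`.
    async with websockets.connect(
        uri, additional_headers={"X-Hume-Api-Key": api_key}
    ) as ws:
        # raw_text=True asks the endpoint to treat `data` as plain text
        # rather than a base64-encoded media file.
        await ws.send(json.dumps({
            "models": {"language": {}},
            "raw_text": True,
            "data": text,
        }))
        return json.loads(await ws.recv())


if __name__ == "__main__":
    result = asyncio.run(measure_text("I can't believe it worked!", "YOUR_API_KEY"))
    print(result)
```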

REST and WebSocket endpoints provide access to the same Hume models, but with different speed and scale tradeoffs. All models share a common response format that associates a score with each detected expression. Scores indicate the degree to which a human rater would assign an expression to a given sample of video, text, or audio.
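Because the response format is shared, downstream handling can be model-agnostic. Here is a minimal sketch of ranking expressions by score, assuming you have already navigated the model-specific nesting down to the list of name/score objects:

```python
# A minimal sketch: `emotions` is assumed to be the shared-format list that
# pairs each expression name with a score, e.g. [{"name": ..., "score": ...}].
from typing import Dict, List, Tuple


def top_expressions(emotions: List[Dict], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (name, score) pairs."""
    ranked = sorted(emotions, key=lambda e: e["score"], reverse=True)
    return [(e["name"], round(e["score"], 3)) for e in ranked[:k]]


# Hand-written fragment in the documented shape, for illustration only:
sample = [
    {"name": "Joy", "score": 0.71},
    {"name": "Amusement", "score": 0.64},
    {"name": "Surprise (positive)", "score": 0.22},
]
print(top_expressions(sample, k=2))  # [('Joy', 0.71), ('Amusement', 0.64)]
```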

Specific expressions by modality

Our models measure 53 expressions identified through the subtleties of emotional language and 48 expressions discerned from facial cues, vocal bursts, and speech prosody.

Expression | Language | Face/Burst/Prosody
Admiration | ✓ | ✓
Adoration | ✓ | ✓
Aesthetic Appreciation | ✓ | ✓
Amusement | ✓ | ✓
Anger | ✓ | ✓
Annoyance | ✓ |
Anxiety | ✓ | ✓
Awe | ✓ | ✓
Awkwardness | ✓ | ✓
Boredom | ✓ | ✓
Calmness | ✓ | ✓
Concentration | ✓ | ✓
Confusion | ✓ | ✓
Contemplation | ✓ | ✓
Contempt | ✓ | ✓
Contentment | ✓ | ✓
Craving | ✓ | ✓
Desire | ✓ | ✓
Determination | ✓ | ✓
Disappointment | ✓ | ✓
Disapproval | ✓ |
Disgust | ✓ | ✓
Distress | ✓ | ✓
Doubt | ✓ | ✓
Ecstasy | ✓ | ✓
Embarrassment | ✓ | ✓
Empathic Pain | ✓ | ✓
Enthusiasm | ✓ |
Entrancement | ✓ | ✓
Envy | ✓ | ✓
Excitement | ✓ | ✓
Fear | ✓ | ✓
Gratitude | ✓ |
Guilt | ✓ | ✓
Horror | ✓ | ✓
Interest | ✓ | ✓
Joy | ✓ | ✓
Love | ✓ | ✓
Nostalgia | ✓ | ✓
Pain | ✓ | ✓
Pride | ✓ | ✓
Realization | ✓ | ✓
Relief | ✓ | ✓
Romance | ✓ | ✓
Sadness | ✓ | ✓
Sarcasm | ✓ |
Satisfaction | ✓ | ✓
Shame | ✓ | ✓
Surprise (negative) | ✓ | ✓
Surprise (positive) | ✓ | ✓
Sympathy | ✓ | ✓
Tiredness | ✓ | ✓
Triumph | ✓ | ✓