Agora

Guide to integrating Hume TTS with Agora's Conversational AI Engine.

Agora is a real-time communication and conversational AI platform. With Agora’s API, developers can build AI voice agents with any LLM and integrate with Hume’s expressive text-to-speech API for high-quality voice synthesis.

Hume’s expressive TTS can be integrated into your Agora agents to deliver natural, emotionally aware speech in conversational AI. This guide covers setup instructions, integration patterns, and configuration best practices for using Hume TTS with Agora.

Want to get right to the code? See our complete Agora example project on GitHub.

Authentication

To use Hume TTS with Agora, you’ll need both Hume and Agora credentials. Follow these steps to obtain your credentials and set up environment variables.

  1. Get your Hume API key

     To get your Hume API key, sign in to the Hume Platform and follow the Getting your API keys guide.

  2. Get your Agora credentials

     Sign up for an Agora account and create a project in the Agora Console. Copy the following credentials from your project dashboard: App ID, App Certificate, Customer ID, and Customer Secret.

  3. Configure environment variables

     Create a .env.local file in your Next.js project and define the required environment variables:

.env.local
# Agora Configuration
NEXT_PUBLIC_AGORA_APP_ID=
NEXT_PUBLIC_AGORA_APP_CERTIFICATE=
NEXT_PUBLIC_AGORA_CUSTOMER_ID=
NEXT_PUBLIC_AGORA_CUSTOMER_SECRET=
NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL=https://api.agora.io/api/conversational-ai-agent/v2/projects/
NEXT_PUBLIC_AGENT_UID=
# LLM Configuration
NEXT_PUBLIC_LLM_URL=https://api.openai.com/v1/chat/completions
NEXT_PUBLIC_LLM_MODEL=gpt-4
NEXT_PUBLIC_LLM_API_KEY=
# TTS Configuration
NEXT_PUBLIC_TTS_VENDOR=hume
# Hume Configuration
NEXT_PUBLIC_HUME_API_KEY=
NEXT_PUBLIC_HUME_VOICE_ID=
# Modalities Configuration
NEXT_PUBLIC_INPUT_MODALITIES=text,audio
NEXT_PUBLIC_OUTPUT_MODALITIES=text,audio
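
Because several of these values are left blank in the template, it can help to fail fast when one is missing. The following is a minimal sketch, not part of the example project: the helper name findMissingEnvVars and the choice of which variables count as required are assumptions made for illustration. It is assumed to run on the server, where process.env is an ordinary object.

```typescript
// Variables from .env.local that this sketch treats as required.
// Which ones are truly mandatory is an assumption; adjust to your setup.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_AGORA_APP_ID",
  "NEXT_PUBLIC_AGORA_APP_CERTIFICATE",
  "NEXT_PUBLIC_AGORA_CUSTOMER_ID",
  "NEXT_PUBLIC_AGORA_CUSTOMER_SECRET",
  "NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL",
  "NEXT_PUBLIC_LLM_API_KEY",
  "NEXT_PUBLIC_HUME_API_KEY",
  "NEXT_PUBLIC_HUME_VOICE_ID",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables
// that are unset or contain only whitespace.
function findMissingEnvVars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

A caller could pass process.env to findMissingEnvVars at startup and throw an error that lists the returned names if the array is not empty.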

Usage

Agora’s Conversational AI Engine enables you to build voice AI agents with any LLM by orchestrating the complete speech-to-speech pipeline: automatic speech recognition (ASR) converts user speech to text, your chosen LLM processes the text and generates a response, and Hume TTS synthesizes the LLM’s text output into natural, expressive speech.

Building a Conversational AI Agent

The Conversational AI Engine handles the entire voice interaction flow, allowing you to focus on configuring your LLM and TTS provider. When using Hume TTS, the Agora engine manages audio streaming and interruption handling.

Integration workflow:

  1. Configure your LLM: Connect any LLM provider (OpenAI, Azure OpenAI, Google Gemini, Anthropic Claude, or a custom model) to generate responses to user speech.

  2. Set Hume as your TTS provider: Configure Hume TTS in your Agora agent to synthesize the LLM’s text responses into natural, emotionally aware speech.

  3. Select a voice: Choose from Hume’s extensive Voice Library or use a custom voice you’ve created for a consistent agent personality.

  4. Deploy your agent: Agora’s engine handles real-time audio streaming and interruption detection, and maintains the conversation flow between the user and your AI agent.
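
The workflow above can be sketched as a function that assembles the request used to start an agent. This is a rough illustration under stated assumptions, not Agora's documented contract: only the tts block mirrors this guide, while the /join path, the Basic authentication scheme, and the name, properties, channel, agent_rtc_uid, llm, and system_messages fields are assumptions that should be checked against the Agora Conversational AI Engine API reference. The helper name buildStartAgentRequest is hypothetical.

```typescript
interface AgentSettings {
  baseUrl: string; // value of NEXT_PUBLIC_AGORA_CONVO_AI_BASE_URL (ends with "/")
  appId: string;
  customerId: string;
  customerSecret: string;
  channel: string;
  agentUid: string;
  llmUrl: string;
  llmApiKey: string;
  llmModel: string;
  humeApiKey: string;
  humeVoiceId: string;
}

interface AgentRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) the start-agent request.
function buildStartAgentRequest(s: AgentSettings): AgentRequest {
  // Assumption: the REST API accepts Basic auth built from customer ID and secret.
  const credentials = btoa(`${s.customerId}:${s.customerSecret}`);

  const payload = {
    name: `agent-${s.channel}`, // assumed field
    properties: {
      channel: s.channel, // assumed field
      agent_rtc_uid: s.agentUid, // assumed field
      // Step 1: the LLM that generates responses (field names assumed).
      llm: {
        url: s.llmUrl,
        api_key: s.llmApiKey,
        system_messages: [
          { role: "system", content: "Keep answers short and conversational." },
        ],
        params: { model: s.llmModel },
      },
      // Steps 2 and 3: Hume as the TTS vendor, with the selected voice.
      tts: {
        vendor: "hume",
        params: { key: s.humeApiKey, voice_id: s.humeVoiceId },
      },
    },
  };

  return {
    url: `${s.baseUrl}${s.appId}/join`, // "/join" suffix is an assumption
    headers: {
      Authorization: `Basic ${credentials}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

Step 4 would then amount to sending the result with fetch(request.url, { method: "POST", headers: request.headers, body: request.body }) from server-side code.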

Configuration example:

For a complete Next.js implementation with Agora and Hume TTS, see our Agora example project.

Sample Configuration
"tts": {
  "vendor": "hume",
  "params": {
    "key": "<HUME_API_KEY>",
    "voice_id": process.env.NEXT_PUBLIC_HUME_VOICE_ID,
    "trailing_silence": 0.35,
    "speed": 1,
  }
}
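
One way to keep this block consistent across an application is to build it from the environment in a single typed helper. The sketch below is illustrative only: buildHumeTtsConfig and its interfaces are hypothetical names, and the defaults simply repeat the trailing_silence and speed values from the sample above.

```typescript
interface HumeTtsParams {
  key: string;
  voice_id: string;
  trailing_silence: number;
  speed: number;
}

interface HumeTtsConfig {
  vendor: "hume";
  params: HumeTtsParams;
}

type TtsOverrides = Partial<Pick<HumeTtsParams, "trailing_silence" | "speed">>;

// Hypothetical helper: reads the Hume variables defined in .env.local and
// returns the tts block, throwing early if either credential is missing.
function buildHumeTtsConfig(
  env: Record<string, string | undefined>,
  overrides: TtsOverrides = {}
): HumeTtsConfig {
  const key = env["NEXT_PUBLIC_HUME_API_KEY"];
  const voiceId = env["NEXT_PUBLIC_HUME_VOICE_ID"];
  if (!key || !voiceId) {
    throw new Error("Hume API key and voice ID must both be set");
  }
  return {
    vendor: "hume",
    params: {
      key,
      voice_id: voiceId,
      // Defaults copied from the sample configuration in this guide.
      trailing_silence: overrides.trailing_silence ?? 0.35,
      speed: overrides.speed ?? 1,
    },
  };
}
```

Calling buildHumeTtsConfig(process.env) on the server yields the same shape as the sample, and an optional second argument such as { speed: 1.1 } overrides a single value without repeating the rest.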

Best Practices

When building conversational AI agents with Agora and Hume TTS, consider the following:

  • Voice selection: Choose a voice from Hume’s Voice Library that matches your agent’s personality, or create a custom voice for brand consistency.

  • LLM prompt engineering: Design your LLM prompts to work well with voice interactions: keep responses concise and natural for spoken delivery.

  • Interruption handling: Agora’s Conversational AI Engine automatically handles interruptions, allowing users to interrupt the agent mid-response for more natural conversations.

Constraints

  • Audio format compatibility: Hume TTS outputs audio at a 48 kHz sample rate. Agora supports various sample rates, so make sure the audio is resampled if your Agora configuration requires a different rate.

  • One utterance per request: Each Hume TTS API request processes a single utterance. Split multi-utterance text into separate requests for granular control.
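
If you need to split text yourself, a simple approach is to break it at sentence boundaries. The sketch below treats each sentence as one utterance, which is an assumption about how you want to segment the text rather than a rule of the Hume API. The helper name splitIntoUtterances is hypothetical.

```typescript
// Hypothetical helper: splits text into sentence-like utterances.
// A chunk ends at one or more of ".", "!" or "?"; any trailing text
// without terminal punctuation becomes the final utterance.
function splitIntoUtterances(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) ?? [];
  return matches
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}
```

For example, splitIntoUtterances("Hello there! How can I help? I am listening") returns ["Hello there!", "How can I help?", "I am listening"]. This naive pattern also splits on periods inside abbreviations and decimal numbers, so text containing those would need a more careful tokenizer.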

Resources

Agora Conversational AI Engine

Reference the official Agora docs for the Conversational AI Engine, including API references, LLM integration, and TTS provider configuration.

Hume TTS Configuration

Learn how to configure Hume AI as your TTS provider in Agora’s Conversational AI Engine.

Agora Example Project

Use a working Next.js example to get started with Hume TTS and Agora’s Conversational AI Engine.

Hume TTS Documentation

Learn more about Hume’s speech-language model and the features of Hume’s TTS API.