TTS CLI Quickstart Guide

Step-by-step guide for integrating the TTS API using Hume’s CLI.

The Hume CLI provides a simple interface for generating speech, saving voices, and exploring the features of the Hume TTS API. This guide shows how to get started with Hume’s Text-to-Speech from the command line. It demonstrates:

  1. Converting text to speech with a new voice.
  2. Saving a voice to your voice library for future use.
  3. Giving “acting instructions” to modulate the voice.
  4. Generating multiple variations of the same text at once.
  5. Providing context to maintain consistency across multiple generations.

Installation

Install the Hume CLI using npm:

$ npm install -g @humeai/cli

See usage information by running hume tts --help.
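
Before authenticating, it can help to confirm that the install actually put hume on your PATH. The snippet below is a minimal sketch of such a check. It assumes nothing about the Hume CLI beyond its command name: require_cmd is a hypothetical helper, not part of the CLI, and it relies only on the shell's standard command -v.

```shell
# Hypothetical helper (not part of the Hume CLI):
# report whether a command is available on PATH.
# Prints "found: NAME" and returns 0, or prints "missing: NAME"
# to stderr and returns 1.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Example use:
#   require_cmd npm && require_cmd hume
```

If require_cmd hume reports the command as missing right after a global npm install, the usual cause is that npm's global bin directory is not on your PATH.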

Authentication

Authenticate using the CLI:

$ hume login

This opens a browser window to the Hume AI platform, where you can retrieve your API key. The CLI then prompts you to enter it.

Calling Text-to-Speech

To use Hume TTS:

  • Provide the text you want to speak as a positional argument.
  • Optionally, provide the --description flag to control how the voice sounds. If you don’t provide a description, Hume will examine the text and attempt to determine an appropriate voice.

$ hume tts "Take an arrow from the quiver." \
> --description "A refined, British aristocrat"

By default, the CLI will:

  • save the audio to the output directory (defaults to ./hume-tts-output),
  • attempt to play it automatically, and
  • display the generation_id for the speech, for future reference.
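
Because audio lands in the output directory by default, a small helper can locate the newest file for playback or post-processing. This is a sketch, assuming the default ./hume-tts-output directory described above and that each generation is written as a file directly inside it. latest_file is a hypothetical helper, and nothing is assumed about how the CLI names its audio files.

```shell
# Hypothetical helper (not part of the Hume CLI):
# print the path of the most recently modified entry in a directory.
# Defaults to ./hume-tts-output, the CLI's default output directory.
# Returns 1 if the directory is missing or empty.
latest_file() {
  dir="${1:-./hume-tts-output}"
  # ls -t sorts by modification time, newest first.
  name=$(ls -t "$dir" 2>/dev/null | head -n 1)
  [ -n "$name" ] || return 1
  printf '%s/%s\n' "$dir" "$name"
}

# Example use:
#   latest_file              # newest file in ./hume-tts-output
#   latest_file ~/audio      # newest file in a custom directory
```

Sorting by modification time avoids depending on the file naming scheme, at the cost of misbehaving on file names that contain newlines.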

Saving voices

When you find a voice you like, use the hume voices create command to give it a name and save it to your voice library for future use. You can specify the generation ID:

$ hume voices create \
> --name aristocrat \
> --generation-id GENERATION_ID

Alternatively, use the --last flag to save the most recent generation:

$ hume voices create --name aristocrat --last

Continuity

To use a voice from your library, specify its name with the --voice-name flag.

$ hume tts "Now take a bow." --voice-name aristocrat

To make speech sound as though it follows from earlier speech, pass the generation_id of the earlier speech with the --context-generation-id flag.

$ # For example, if PREVIOUS_GENERATION_ID refers to speech
$ # about archery, 'bow' will be pronounced to rhyme with
$ # 'toe' and not 'cow'.
$ hume tts "Now take a bow." \
> --voice-name aristocrat \
> --context-generation-id PREVIOUS_GENERATION_ID

Alternatively, use the --last flag to continue from the most recent generation.

$ hume tts "Now take a bow." --voice-name aristocrat --last
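
The --last flag makes it straightforward to chain several lines so that each one continues from the one before. The following is a minimal sketch of that pattern, using only the --voice-name and --last flags shown above. narrate is a hypothetical helper, and HUME_BIN is a hypothetical variable introduced here so the command can be swapped out for a dry run. Neither is part of the Hume CLI.

```shell
# Hypothetical helper (not part of the Hume CLI):
# speak each argument in order with the same saved voice.
# The first line starts fresh; every later line passes --last so it
# continues from the generation made just before it.
# Usage: narrate VOICE_NAME "line 1" "line 2" ...
narrate() {
  voice="$1"
  shift
  first=1
  for line in "$@"; do
    if [ "$first" -eq 1 ]; then
      "${HUME_BIN:-hume}" tts "$line" --voice-name "$voice"
      first=0
    else
      "${HUME_BIN:-hume}" tts "$line" --voice-name "$voice" --last
    fi
  done
}

# Example use:
#   narrate aristocrat "Take an arrow from the quiver." "Now take a bow."
# Dry run that only prints the arguments (HUME_BIN is an assumption of this sketch):
#   HUME_BIN=echo; narrate aristocrat "Take an arrow from the quiver." "Now take a bow."
```

This relies on the calls running one after another in the same terminal, so that "the most recent generation" is always the previous line.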

Acting Instructions

If you specify both a voice and a description, the description acts as “acting instructions”. The speech keeps the character of the specified voice, modulated to match the description.

$ hume tts "Does he even know how to use that thing?" \
> --voice-name aristocrat \
> --description "Murmured softly, with a heavy dose of sarcasm and contempt"

Generating multiple variations

To generate multiple variations of the same text at once, use the --num-generations flag.

$ hume tts "Now aim at the bullseye, nock your arrow, draw, and..." \
> --voice-name aristocrat \
> --num-generations 3

Other features

$ # Read from stdin
$ cat poem.txt | hume tts -

$ # Machine-readable output
$ hume tts "Hello" --reporter-mode json

$ # Session settings last for the duration of the terminal session
$ hume session set tts.voiceName aristocrat
$ hume session set tts.outputDir ~/audio

$ # Global settings persist until changed
$ hume config set tts.play none
$ hume config set reporterMode json

$ # Reset all global settings (this also logs you out)
$ hume config reset
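
Reading from stdin also allows a longer text to be spoken one line at a time instead of as a single generation. The sketch below assumes that hume tts - accepts the same flags as the positional form (for example --voice-name), which the examples above do not confirm. speak_lines and HUME_BIN are hypothetical names introduced for illustration, not part of the Hume CLI.

```shell
# Hypothetical helper (not part of the Hume CLI):
# read text from stdin and send each non-empty line to `hume tts -`
# as its own generation. Any arguments are forwarded as extra flags.
# Usage: speak_lines [FLAGS...] < file.txt
speak_lines() {
  # The `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue   # skip blank lines
    printf '%s\n' "$line" | "${HUME_BIN:-hume}" tts - "$@"
  done
}

# Example use:
#   speak_lines --voice-name aristocrat < poem.txt
```

Each line is piped into its own hume process, so the loop's stdin (the file) is never consumed by the CLI itself.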