Stream Input

Generate emotionally expressive speech.

Handshake

WSS
wss://api.hume.ai/v0/tts/stream/input

Query Parameters

access_token (string, Optional)
Access token used for authenticating the client. If not provided, an `api_key` must be provided to authenticate. The access token is generated using both an API key and a Secret key, which provides an additional layer of security compared to using just an API key. For more details, refer to the [Authentication Strategies Guide](/docs/introduction/api-key#authentication-strategies).
context_generation_id (string, Optional)
The ID of a prior TTS generation to use as context for generating consistent speech style and prosody across multiple requests. Including context may increase audio generation times.
format_type (enum, Optional)
The format to be used for audio generation.
include_timestamp_types (list of enums, Optional)
The set of timestamp types to include in the response.
Allowed values:
instant_mode (boolean, Optional, defaults to true)

Enables ultra-low latency streaming, significantly reducing the time until the first audio chunk is received. Recommended for real-time applications requiring immediate audio playback. For further details, see our documentation on instant mode.

no_binary (boolean, Optional, defaults to false)
If enabled, no binary websocket messages will be sent to the client.
strip_headers (boolean, Optional, defaults to false)

If enabled, the audio for all the chunks of a generation, once concatenated together, will constitute a single audio file. Otherwise, if disabled, each chunk’s audio will be its own audio file, each with its own headers (if applicable).

version (enum, Optional)
The version of the Octave Model to use: 1 for the legacy model, 2 for the new model.
api_key (string, Optional)

API key used for authenticating the client. If not provided, an access_token must be provided to authenticate.

For more details, refer to the Authentication Strategies Guide.
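As an illustration of how the query parameters above combine into a handshake URL, here is a minimal Python sketch. The helper name `build_stream_url` is hypothetical, and two details are assumptions rather than documented behavior: booleans are serialized as lowercase `true`/`false`, and the access token is preferred when both credentials are supplied.

```python
from urllib.parse import urlencode

BASE_URL = "wss://api.hume.ai/v0/tts/stream/input"


def build_stream_url(*, api_key=None, access_token=None, **options):
    """Build the handshake URL (hypothetical helper, not part of any SDK).

    One of api_key or access_token is required, as the parameter
    descriptions state. Other query parameters (instant_mode,
    strip_headers, version, ...) are passed as keyword arguments.
    """
    if not api_key and not access_token:
        raise ValueError("provide either api_key or access_token")

    params = {}
    # Assumption: when both are given, send only the access token.
    if access_token:
        params["access_token"] = access_token
    else:
        params["api_key"] = api_key

    for name, value in options.items():
        if value is None:
            continue  # omit unset options so server defaults apply
        if isinstance(value, bool):
            # Assumption: booleans are encoded as lowercase strings.
            value = "true" if value else "false"
        params[name] = value

    return f"{BASE_URL}?{urlencode(params)}"


url = build_stream_url(api_key="KEY", instant_mode=True, version=2)
```

Leaving unset options out of the query string keeps the server-side defaults listed above in effect, for example `instant_mode` defaulting to true.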

Send

InputMessage (object, Required)

Receive

TtsOutput (object, Required)
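The `no_binary` and `strip_headers` parameters affect how a client might process what it receives. The following Python sketch shows one possible way to handle that, under stated assumptions: text frames are taken to be JSON, binary frames are taken to carry audio bytes, and the helper names `split_frames` and `assemble_audio` are hypothetical. The fields of `TtsOutput` are not listed in this section, so the code does not inspect them.

```python
import json


def split_frames(frames):
    """Separate binary frames from text frames.

    Assumption: binary frames hold audio bytes and text frames hold
    JSON. With no_binary enabled, only text frames would arrive.
    """
    audio, messages = [], []
    for frame in frames:
        if isinstance(frame, (bytes, bytearray)):
            audio.append(bytes(frame))
        else:
            messages.append(json.loads(frame))
    return audio, messages


def assemble_audio(chunks, strip_headers):
    """Return the audio files represented by a generation's chunks.

    With strip_headers enabled, the concatenated chunks form a single
    audio file. Otherwise each chunk is its own file with its own
    headers, so the chunks are kept separate.
    """
    if strip_headers:
        return [b"".join(chunks)]
    return list(chunks)


# Stand-in frames for illustration only, not real server output.
received = [b"ab", '{"example": 1}', b"cd"]
audio, messages = split_frames(received)
single_file = assemble_audio(audio, strip_headers=True)
separate_files = assemble_audio(audio, strip_headers=False)
```

Keeping the two steps separate means the same frame-splitting logic works whichever `strip_headers` setting the connection was opened with.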