Chat

Chat with Empathic Voice Interface (EVI)

HandshakeTry it

WSS
wss://api.hume.ai/v0/evi/chat

Query Parameters

access_tokenstringOptionalDefaults to
Access token used for authenticating the client. If not provided, an `api_key` must be provided to authenticate. The access token is generated using both an API key and a Secret key, which provides an additional layer of security compared to using just an API key. For more details, refer to the [Authentication Strategies Guide](/docs/introduction/api-key#authentication-strategies).
allow_connectionbooleanOptionalDefaults to false

Allows external connections to this chat via the /connect endpoint.

config_idstringOptional
The unique identifier for an EVI configuration. Include this ID in your connection request to equip EVI with the Prompt, Language Model, Voice, and Tools associated with the specified configuration. If omitted, EVI will apply [default configuration settings](/docs/speech-to-speech-evi/configuration/build-a-configuration#default-configuration). For help obtaining this ID, see our [Configuration Guide](/docs/speech-to-speech-evi/configuration).
config_versionintegerOptional
The version number of the EVI configuration specified by the `config_id`. Configs, as well as Prompts and Tools, are versioned. This versioning system supports iterative development, allowing you to progressively refine configurations and revert to previous versions if needed. Include this parameter to apply a specific version of an EVI configuration. If omitted, the latest version will be applied.
event_limitintegerOptional

The maximum number of chat events to return from chat history. By default, the system returns up to 300 events (100 events per page × 3 pages). Set this parameter to a smaller value to limit the number of events returned.

resumed_chat_group_idstringOptional
The unique identifier for a Chat Group. Use this field to preserve context from a previous Chat session. A Chat represents a single session from opening to closing a WebSocket connection. In contrast, a Chat Group is a series of resumed Chats that collectively represent a single conversation spanning multiple sessions. Each Chat includes a Chat Group ID, which is used to preserve the context of previous Chat sessions when starting a new one. Including the Chat Group ID in the `resumed_chat_group_id` query parameter is useful for seamlessly resuming a Chat after unexpected network disconnections and for picking up conversations exactly where you left off at a later time. This ensures preserved context across multiple sessions. There are three ways to obtain the Chat Group ID: - [Chat Metadata](/reference/speech-to-speech-evi/chat#receive.ChatMetadata): Upon establishing a WebSocket connection with EVI, the user receives a Chat Metadata message. This message contains a `chat_group_id`, which can be used to resume conversations within this chat group in future sessions. - [List Chats endpoint](/reference/speech-to-speech-evi/chats/list-chats): Use the GET `/v0/evi/chats` endpoint to obtain the Chat Group ID of individual Chat sessions. This endpoint lists all available Chat sessions and their associated Chat Group ID. - [List Chat Groups endpoint](/reference/speech-to-speech-evi/chat-groups/list-chat-groups): Use the GET `/v0/evi/chat_groups` endpoint to obtain the Chat Group IDs of all Chat Groups associated with an API key. This endpoint returns a list of all available chat groups.
verbose_transcriptionbooleanOptionalDefaults to false
A flag to enable verbose transcription. Set this query parameter to `true` to have unfinalized user transcripts be sent to the client as interim UserMessage messages. The [interim](/reference/speech-to-speech-evi/chat#receive.UserMessage.interim) field on a [UserMessage](/reference/speech-to-speech-evi/chat#receive.UserMessage) denotes whether the message is "interim" or "final."
api_keystringOptionalDefaults to

API key used for authenticating the client. If not provided, an access_token must be provided to authenticate.

For more details, refer to the Authentication Strategies Guide.

session_settingsobjectOptional

Send

AudioInputobjectRequired
**Base64 encoded audio input to insert into the conversation.** The content is treated as the user's speech to EVI and must be streamed continuously. Pre-recorded audio files are not supported. For optimal transcription quality, the audio data should be transmitted in small chunks. Hume recommends streaming audio with a buffer window of `20` milliseconds (ms), or `100` milliseconds (ms) for web applications. See our [Audio Guide](/docs/speech-to-speech-evi/guides/audio) for more details on preparing and processing audio.
OR
SessionSettingsobjectRequired
**Settings for this chat session.** Session settings are temporary and apply only to the current Chat session. These settings can be adjusted dynamically based on the requirements of each session to ensure optimal performance and user experience. See our [Session Settings Guide](/docs/speech-to-speech-evi/configuration/session-settings) for a complete list of configurable settings.
OR
UserInputobjectRequired

User text to insert into the conversation. Text sent through a User Input message is treated as the user’s speech to EVI. EVI processes this input and provides a corresponding response.

Expression measurement results are not available for User Input messages, as the prosody model relies on audio input and cannot process text alone.

OR
AssistantInputobjectRequired
**Assistant text to synthesize into spoken audio and insert into the conversation.** EVI uses this text to generate spoken audio using our proprietary expressive text-to-speech model. Our model adds appropriate emotional inflections and tones to the text based on the user's expressions and the context of the conversation. The synthesized audio is streamed back to the user as an Assistant Message.
OR
ToolResponseMessageobjectRequired
**Return value of the tool call.** Contains the output generated by the tool to pass back to EVI. Upon receiving a Tool Call message and successfully invoking the function, this message is sent to convey the result of the function call back to EVI. For built-in tools implemented on the server, you will receive this message type rather than a `ToolCallMessage`. See our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for further details.
OR
ToolErrorMessageobjectRequired
**Error message from the tool call**, not exposed to the LLM or user. Upon receiving a Tool Call message and failing to invoke the function, this message is sent to notify EVI of the tool's failure. For built-in tools implemented on the server, you will receive this message type rather than a `ToolCallMessage` if the tool fails. See our [Tool Use Guide](/docs/speech-to-speech-evi/features/tool-use) for further details.
OR
PauseAssistantMessageobjectRequired
**Pause responses from EVI.** Chat history is still saved and sent after resuming. Once this message is sent, EVI will not respond until a Resume Assistant message is sent. When paused, EVI won't respond, but transcriptions of your audio inputs will still be recorded. See our [Pause Response Guide](/docs/speech-to-speech-evi/features/pause-responses) for further details.
OR
ResumeAssistantMessageobjectRequired
**Resume responses from EVI.** Chat history sent while paused will now be sent. Upon resuming, if any audio input was sent during the pause, EVI will retain context from all messages sent but only respond to the last user message. See our [Pause Response Guide](/docs/speech-to-speech-evi/features/pause-responses) for further details.

Receive

SubscribeEventobjectRequired