EVI TypeScript Quickstart
A quickstart guide for integrating the Empathic Voice Interface (EVI) with TypeScript.
In this guide, you’ll learn how to integrate EVI into your TypeScript applications using Hume’s TypeScript SDK.
- Installation: Install the hume package.
- Authentication: Instantiate the Hume client using your API credentials.
- Connection: Initialize a WebSocket connection to interact with EVI.
- Audio capture: Capture and stream audio input.
- Audio playback: Play back EVI’s streamed audio output.
- Interruption: Handle user interruptions client-side.
See the complete implementation of this guide on GitHub
Explore or contribute to Hume’s TypeScript SDK on GitHub
This guide primarily targets web browser implementations. For non-browser environments (e.g., Node.js), audio capture and playback implementation will vary based on your runtime context.
Installation
Install the Hume TypeScript SDK package.
- pnpm: pnpm add hume
- npm: npm install hume
- yarn: yarn add hume
- bun: bun add hume
Authentication
Instantiate the Hume client with your API credentials to authenticate. Visit our Getting your API keys page for details on how to obtain your credentials.
This example uses direct API key authentication for simplicity. For production browser environments, implement the Token authentication strategy instead to prevent exposing your API key to the client.
Load API keys from environment variables. Avoid hardcoding them in your code to prevent credential leaks and unauthorized access.
Connection
With the Hume client instantiated, establish an authenticated WebSocket connection using the client's empathicVoice.chat.connect method, and assign WebSocket event handlers.
Audio capture
Capture audio input from the user’s microphone and stream it to EVI over the WebSocket:
- Request microphone access from the user.
- Obtain the audio stream using the MediaStream API.
- Record audio chunks using the MediaRecorder API.
- Encode each audio chunk in base64.
- Stream encoded audio to EVI by sending audio_input messages to Hume over WebSocket using the SDK's sendAudioInput method.
Accepted audio formats include: mp3, wav, aac, ogg, flac, webm, avr, cdda, cvs/vms, aiff, au, amr, mp2, mp4, ac3, avi, wmv, mpeg, ircam.
Invoke the startAudioCapture function within handleOpen to start streaming audio once a connection is established:
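For example, a minimal updated handleOpen might look like this (startAudioCapture as defined above):

```typescript
// startAudioCapture is the capture function defined in this section.
declare function startAudioCapture(): Promise<void>;

async function handleOpen(): Promise<void> {
  console.log("WebSocket connection opened");
  await startAudioCapture(); // begin streaming mic audio to EVI
}
```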
Update your handleClose function to ensure audio capture stops appropriately on disconnect:
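A sketch of the cleanup, with the recorder and stream references from the capture step redeclared here so the snippet stands alone:

```typescript
// Module-level refs created in startAudioCapture (redeclared for this sketch).
let recorder: MediaRecorder | null = null;
let audioStream: MediaStream | null = null;

function handleClose(): void {
  console.log("WebSocket connection closed");
  recorder?.stop();                                  // stop emitting chunks
  audioStream?.getTracks().forEach((t) => t.stop()); // release the microphone
  recorder = null;
  audioStream = null;
}
```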
Audio playback
Handle playback of audio responses from EVI using the Hume TypeScript SDK’s EVIWebAudioPlayer.
- Initialize the audio player when the WebSocket connection opens.
- Queue audio responses received from EVI for playback.
- Dispose of the audio player when the WebSocket connection closes to release resources.
After starting audio capture, initialize the player within handleOpen.
Update handleMessage to enqueue received audio responses for playback:
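A sketch that switches on the message type and forwards audio_output messages to the player (the player is typed loosely here so the snippet stands alone):

```typescript
// The player created during setup, typed loosely for this sketch.
declare const player: { enqueue(message: unknown): Promise<void> };

async function handleMessage(message: { type: string }): Promise<void> {
  switch (message.type) {
    case "audio_output":
      await player.enqueue(message); // queue EVI audio for playback
      break;
  }
}
```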
Update handleClose to dispose of the audio player when the WebSocket disconnects:
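For example (the player is typed loosely here so the snippet stands alone):

```typescript
// The player created during setup, typed loosely for this sketch.
declare const player: { dispose(): void };

function handleClose(): void {
  console.log("WebSocket connection closed");
  player.dispose(); // release Web Audio resources
}
```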
Interruption
When an interruption is detected, EVI will immediately stop sending further response messages and wait for the user’s new input.
The client must then explicitly handle the interruption by stopping ongoing audio playback.
To stop audio playback on user interruption, update handleMessage to invoke EVIWebAudioPlayer.stop when you receive a user_interruption message:
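A sketch of the interruption branch (the player is typed loosely here so the snippet stands alone):

```typescript
// The player created during setup, typed loosely for this sketch.
declare const player: { stop(): void };

function handleMessage(message: { type: string }): void {
  switch (message.type) {
    case "user_interruption":
      // EVI has already stopped sending audio; cut off local playback too.
      player.stop();
      break;
  }
}
```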
Next steps
Congratulations! You’ve successfully implemented a real-time conversational application using Hume’s Empathic Voice Interface (EVI). In this quickstart, you’ve learned the core aspects of authentication, WebSocket communication, audio streaming, playback handling, and interruption management.
Next, consider exploring these areas to enhance your EVI application:
See detailed instructions on how you can customize EVI for your application needs.
Learn how you can access and manage conversation transcripts and expression measures.
For further details and practical examples, explore the API Reference and our Hume API Examples on GitHub.