Interruptibility
Guide to EVI’s interruptibility feature and how to manage interruptions on the client.
Interruptibility is a key feature of EVI, allowing seamless, real-time interactions even when the user interjects mid-response. EVI handles interruptions on the backend (stopping response generation) and supports interruption on the frontend (managing audio playback) to maintain a natural conversation flow.
EVI stops generating audio when interrupted, but you are responsible for stopping playback of any audio already received on the client side to ensure a seamless, responsive experience.
How interruption works
EVI sends responses in chunks as assistant_message messages, each accompanied by a corresponding audio_output message. The assistant_message messages contain both the content and the expression measurement predictions, while the audio_output messages contain the generated audio. Once EVI finishes generating a response, it sends an assistant_end message to indicate that the response is complete.
When a user message is detected during response generation, EVI stops generating the current response and sends a user_interruption message to signal this event. The user_interruption message tells the client to halt audio playback, clear any remaining audio in the queue, and prepare for new input from the user.
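The message flow described above can be sketched as a small state tracker. This is an illustrative sketch, not part of any SDK: the message type names come from this guide (using the user_interruption spelling from the client-side section), while ResponseState, nextState, and replay are hypothetical names introduced here.

```typescript
// Message type names are taken from this guide; everything else is hypothetical.
type EviMessageType =
  | "assistant_message"
  | "audio_output"
  | "assistant_end"
  | "user_interruption";

// Hypothetical client-side view of where the current response stands.
type ResponseState = "idle" | "responding" | "interrupted";

// Returns the next state after receiving a message of the given type.
function nextState(state: ResponseState, type: EviMessageType): ResponseState {
  switch (type) {
    case "assistant_message":
    case "audio_output":
      // A chunk of the response (content or audio) has arrived.
      return "responding";
    case "assistant_end":
      // EVI finished generating the response.
      return "idle";
    case "user_interruption":
      // The user spoke while the response was still being generated.
      return "interrupted";
  }
}

// Replays a sequence of message types, starting from "idle".
function replay(types: EviMessageType[]): ResponseState {
  return types.reduce<ResponseState>(nextState, "idle");
}
```

A response that runs to completion ends in the idle state, while one cut short by the user ends in the interrupted state, which is the point at which the client should stop and clear its audio.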
Handling interruptions on the client side
While backend interruptions are managed by EVI, frontend interruptions (specifically, stopping audio playback) require client-side handling. Both user_interruption messages (during response generation) and user_message events (after the response is complete) should trigger the client to stop audio playback for the previous response.
To handle interruptions consistently, the client should perform the following actions whenever a user_interruption or user_message is received:
- Stop audio playback: Immediately halt playback of any ongoing audio from the previous response.
- Clear queued audio: Remove any remaining audio segments in the queue to prevent overlap with new responses.
This approach ensures that any user input interrupts audio playback as expected, so the client responds promptly and the conversation keeps a natural flow.
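The two actions above can be sketched as a minimal playback queue. This is a sketch under stated assumptions rather than a definitive implementation: AudioPlayer, PlaybackQueue, and handleMessage are hypothetical names, the data field on audio_output is assumed to carry the audio payload, and a real client would wrap its own audio API behind the player interface.

```typescript
// Hypothetical player interface; a real client would wrap its audio API here.
interface AudioPlayer {
  play(clip: string): void;
  stop(): void;
}

// Hypothetical queue that plays one clip at a time.
class PlaybackQueue {
  private queue: string[] = [];
  private playing = false;
  private player: AudioPlayer;

  constructor(player: AudioPlayer) {
    this.player = player;
  }

  // Adds a clip and starts playback if nothing is playing.
  enqueue(clip: string): void {
    this.queue.push(clip);
    if (!this.playing) this.playNext();
  }

  // Call when the current clip finishes to move on to the next one.
  playNext(): void {
    const clip = this.queue.shift();
    if (clip === undefined) {
      this.playing = false;
      return;
    }
    this.playing = true;
    this.player.play(clip);
  }

  // Stops the current clip and discards everything still queued.
  interrupt(): void {
    this.player.stop();
    this.queue.length = 0;
    this.playing = false;
  }

  get pending(): number {
    return this.queue.length;
  }
}

// Message type names come from this guide; the `data` field is an assumption.
type IncomingMessage =
  | { type: "audio_output"; data: string }
  | { type: "user_interruption" }
  | { type: "user_message" }
  | { type: "assistant_message" }
  | { type: "assistant_end" };

function handleMessage(msg: IncomingMessage, queue: PlaybackQueue): void {
  switch (msg.type) {
    case "audio_output":
      queue.enqueue(msg.data);
      break;
    case "user_interruption":
    case "user_message":
      // Either message means the user has spoken: stop and clear audio.
      queue.interrupt();
      break;
    default:
      // Other message types do not affect playback in this sketch.
      break;
  }
}
```

Routing both message types through a single interrupt method keeps the behavior identical whether the user speaks during generation or after the response has finished.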