To get started quickly, please see the custom language model example in our example GitHub repository.
The custom language model (CLM) feature allows you to use your own language model to drive EVI’s responses. When you configure a custom language model, EVI will send requests to your server with textual conversation history and emotional context. Your server is responsible for responding with the text that EVI should speak next.
A custom language model can be:
CLMs are appropriate for use cases that involve deep configurability, for example:
You should prefer using context injection instead of a CLM for use cases that do not require deep configurability. When Hume connects to an upstream LLM provider directly, it covers the cost of usage, and this results in less latency compared to if Hume connects to your CLM which connects to an upstream LLM provider.
First, create a new config, or update an existing config and select the “custom language model” option in the “Set up LLM” step. Type in the URL of your custom language model endpoint. If you are using the SSE interface (recommended), the URL should start with https:// and end with /chat/completions. If you are using websockets, the URL should start with wss://. For SSE endpoints, you can optionally set the “Custom language model identifier” if your endpoint requires a specific model to decide behavior (e.g., if your backend expects an identifier like “gpt-5-mini”). The endpoint needs to be accessible from the public internet. If you are developing locally, you can use a service like ngrok to give your local server a publicly accessible URL.
The recommended way to set up a CLM is to expose an POST /chat/completions endpoint that responds with a stream of Server-Sent Events (SSEs) in a format compatible with OpenAI’s POST /v1/chat/completions endpoint
Please reference the project in our examples repository for a runnable example.
Server-Sent Events describe a type of HTTP response that conforms to a certain web standard where
Content-Type: text/event-stream header.Content-Length header, as the length of the entire response is not known in advance.Because EVI expects the events to be in the same format as OpenAI’s chat completions, it is straightforward to a build a CLM that simply “wraps” an OpenAI model with preprocessing or postprocessing logic. More effort is required to build a CLM to wrap a model from a different provider: you will have to convert the output of your model to the OpenAI format.
The following example shows how to build a CLM by “wrapping” an upstream LLM provided by OpenAI. The steps are:
/chat/completions.role and content fields from each message in the message history. (Hume also supplies prosody information and other metadata. In this example, we simply discard that information, but you might attempt to reflect it by adding or modifying the messages you pass upstream.)POST /chat/completions endpoint, passing in the message history and "stream": true.To verify that you have successfully implemented an OpenAI-compatible POST /chat/completions endpoint, you can use the OpenAI SDK but pointed at your server, not api.openai.com. Below is an example verification script (assumes your server is running on localhost:8000):
If your SSE endpoint requires an API key, send it in the language_model_api_key message using a session_settings message when a session begins:
This will cause a header Authorization: Bearer <your-secret-key-here> to be sent as a request header.
We recommend using the SSE interface for your CLM. SSEs are simpler, allow for better security, and have better latency properties. In the past, the WebSocket interface was the only option, so the instructions are preserved here.
Please reference the project in our examples repository for a runnable example.
To use a CLM with WebSockets, the steps are:
Use the web interface or the /v0/evi/configs API to create a configuration. Select “custom language model” and provide the URL of your WebSocket endpoint. If you are developing locally, you can use a service like ngrok to give your local server a publicly accessible URL.
Next, your frontend (or Twilio, if you are using the inbound phone calling endpoint) will connect to EVI via the /v0/evi/chat endpoint, with config_id of that configuration.
EVI will open a WebSocket connection to your server, via the URL you provided when setting up the configuration. This connection is the CLM socket (as opposed to the Chat socket that is already open between the client and EVI).
As the user interacts with EVI, EVI will send messages over the CLM socket to your server, containing the conversation history and emotional context.
Your server is responsible for sending two types of message back over the CLM socket to EVI:
assistant_input messages containing text to speak, andassistant_end messages to indicate when the AI has finished responding, yielding the conversational turn back to the user.You can send multiple assistant_input payloads consecutively to stream text to the assistant. Once you are done sending inputs, you must send an assistant_end payload to indicate the end of your turn.
For managing conversational state and connecting your frontend experiences with your backend data and logic, you should set a custom_session_id for the chat.
Using a custom_session_id will enable you to:
There are two ways to set a custom_session_id:
/chat WebSocket endpoint, you can send a session_settings message over the WebSocket with the custom_session_id field set.custom_session_id as a system_fingerprint on the ChatCompletion type within the message events. With WebSockets, you can include the custom_session_id on the assistant_input message. Use this option if you don’t have control over the WebSocket connection to the client (for example, if you are using the /v0/evi/twilio endpoint for inbound phone calling).You only need to set the custom_session_id once per chat. EVI will remember the custom_session_id for the duration of the conversation.
After you set the custom_session_id, for SSE endpoints, the custom_session_id will be sent as a query parameter to your endpoint. For example POST https://api.example.com/chat/completions?custom_session_id=123. For WebSocket endpoints, the custom_session_id will be included as a top-level property on the incoming message.
If you are sourcing your CLM responses from OpenAI, be careful not to inadvertently override your intended custom_session_id with OpenAI’s system_fingerprint. If you are setting your own custom_session_id, you should always either delete system_fingerprint from OpenAI messages before forwarding them to EVI, or override them with the desired custom_session_id.