Prompt engineering for empathic voice interfaces
System prompts shape the behavior, responses, and style of your custom empathic voice interface (EVI).
Creating an effective system prompt is an essential part of customizing an EVI’s behavior. For the most part, prompting EVI is the same as prompting any LLM, but there are some important differences. Prompting for EVIs is different for two main reasons:
- Prompts are for a voice-only interaction with the user rather than a text-based chat.
- EVIs can respond to the user’s emotional expressions in their tone of voice and not just the text content of their messages.
While EVI generates longer responses using a large frontier model, Hume uses a smaller empathic large language model (eLLM) to quickly generate an initial empathic, conversational response. This eLLM eliminates the usual awkward pause while the larger LLM generates its response, providing a more natural conversational flow. Your system prompt is both used by EVI and passed along to the LLM you select.
Using the following guidelines for prompt engineering allows developers to customize EVI’s response style for any use case, from voice AIs for mental health support to customer service agents.
The system prompt is a powerful and flexible way to guide the AI’s responses, but it cannot dictate the AI’s responses with absolute precision. Careful prompt design and testing will help EVI hold the kinds of conversations you’re looking for. If you need more control over EVI’s responses, try using our custom language model feature for complete control of the text generation.
EVI-specific prompting instructions
The instructions below are specific to prompting empathic voice interfaces.
Prompt for voice-only conversations
As LLMs are trained for primarily text-based interactions, providing guidelines on how to engage with the user with voice makes conversations feel much more fluid and natural. For example, you may prompt the AI to use natural, conversational language. For example, see the instruction below:
If you find the default behavior of the LLM acceptable, then you may only need a very short system prompt. Customizing the LLM’s behavior more and maintaining consistency in longer and more varied conversations often requires lengthening the prompt.
Expressive prompt engineering
Expressive prompt engineering is Hume’s term for techniques that embed emotional expression measures into conversations to allow language models to respond effectively to the user’s expressions. Hume’s EVI uses our expression measurement models to measure the user’s expressions in their tone of voice. You can use the system prompt to guide how the AI voice responds to these non-verbal cues. EVI measures these expressions in real time and converts them into text-based descriptions to help the LLM understand not just what the user said, but how they said it. EVI detects 48 distinct expressions in the user’s voice and ranks these expressions by our model’s confidence that they are present in the user’s speech. Then, we append text descriptions of the top 3 expressions to the end of each User message
to communicate the user’s tone of voice to the LLM.
For example, our demo uses an instruction like the one below to help EVI respond to expressions:
Explain to the LLM exactly how you want it to respond to these expressions and how to use them in the conversation. For example, you may want it to ignore expressions unless the user is angry, or to have particular responses to expressions like doubt or confusion. You can also instruct EVI to detect and respond to mismatches between the user’s tone of voice and the text content of their speech:
EVI is designed for empathic conversations, and you can use expressive prompt engineering to customize how EVI empathizes with the user’s expressions for your use case.
Continue from short response model
We use our eLLM (empathic large language) to rapidly generate short, empathic responses in the conversation before your LLM has finished generating a response. After the eLLM’s response, we send a User message
with the text [continue]
to inform the LLM that it should be continuing from the short response. To help the short response and longer response blend seamlessly together, it is important to use an instruction like the one below:
For almost all use cases, you can simply append this exact instruction to the end of your prompt to help the larger LLM continue from the short response.
Using dynamic variables in your prompt
Dynamic variables are values which can change during a conversation with EVI.
In order to function, dynamic variables must be manually defined within a chat’s session settings. To learn how to do so, visit our Configuration page.
Embedding dynamic variables into your system prompt can help personalize the user experience to reflect user-specific or changing information such as names, preferences, the current date, and other details.
In other words, dynamic variables may be used to customize EVI conversations with specific context for each user and each conversation. For example, you can adjust your system prompt to include conversation-specific information, such as a user’s favorite color or travel plans:
Using a website as EVI’s knowledge base
Web Search is a built-in tool that allows EVI to search the web for up-to-date information. However, instead of searching the entire web, you can configure EVI to search within a single website using a system prompt.
Constraining EVI’s knowledge to a specific website can enable the creation of domain-specific chatbots. For example, you could use this approach to create documentation assistants or product-specific support bots. By leveraging existing web content, it provides a quick alternative to full RAG implementations while still offering targeted information retrieval.
To use a website as EVI’s knowledge base, follow these steps:
-
Enable Web Search: Before you begin, ensure Web Search is enabled as a built-in tool in your EVI configuration. For detailed instructions, visit our Tool Use page.
-
Include a Web Search instruction: In your EVI configuration, modify the system prompt to include a
use_web_search
instruction. -
Specify a target domain: In the instruction, specify that
site:<target_domain>
be appended to all search queries, where the<target_domain>
is the URL of the website you’d like EVI to focus on. For example, you can create a documentation assistant using an instruction like the one below:
General LLM prompting guidelines
Prompting best practices
General prompt engineering best practices also apply to EVIs. For example, ensure your prompts are clear, detailed, direct, and specific. Include necessary instructions and examples in the EVI’s system prompt to set expectations for the LLM. Define the context of the conversation, EVI’s role, personality, tone, greeting style, and any other guidelines for its responses.
For example, to limit the length of the LLM’s responses, you may use a clear instruction like this:
Try to focus on telling the model what it should do (positive reinforcement) rather than what it shouldn’t do (negative reinforcement). LLMs have a harder time consistently avoiding behaviors, and adding them to the prompt may even promote those undesired behaviors.
Understand your LLM’s capabilities
Different LLMs have varying capabilities, limitations, and context windows. More advanced LLMs can handle longer, nuanced prompts, but are often slower and pricier. Simpler LLMs are faster and cheaper but require shorter, less complex prompts with fewer instructions and less nuance. Some LLMs also have longer context windows - the number of tokens the model can process while generating a response, acting essentially as the model’s memory. Tailor your prompt length to fit within the LLM’s context window to ensure the model can use the full conversation history.
Use sections to divide your prompt
Separating your prompt into titled sections can help the model distinguish between different instructions and follow the prompt more reliably. The recommended format for these sections differs between language model providers. For example, OpenAI models often respond best to markdown sections (like ## Role
), while Anthropic models respond well to XML tags (like <role> </role>
). For example:
For Claude models, you may wrap your instructions in tags like <role>
, <personality>
, <response_style>
, <response_format>
, <examples>
, <respond_to_expressions>
, or <stay_concise>
to structure your prompt. This format is not required, but it can improve the LLM’s ability to interpret and consistently follow the system prompt. At the end of your prompt, you may also want to remind the LLM of all of the key instructions in a <conclusion>
section.
Give few-shot examples
Use examples to show the LLM how it should respond, which is a technique known as few-shot learning. Including several specific, concrete examples of ideal interactions that follow your guidelines is one of the most effective ways to improve responses. Use diverse, excellent examples that cover different edge cases and behaviors to reinforce your instructions. Structure these examples as messages, following the format for chat-tuned LLMs. For example:
If you notice that your EVI is consistently failing to follow the prompt in certain situations, try providing examples that show how it should ideally respond in those situations.
Test your prompts
Crafting an effective system prompt to create the conversations you’re looking for often requires several iterations—cycles of changing and testing the prompt, seeing if it produces the conversations you want, and improving it over time. It is often best to start with ten to twenty gold-standard examples of excellent conversations, then test the system prompt for each of these examples after you make major changes. You can also try having voice conversations with your EVI (in the playground) to see if its responses match your expectations or are at least as good as your examples. If not, then try changing one part of the prompt at a time and then re-testing to make sure your changes are improving performance.
Additional resources
To learn more about prompt engineering in general or to understand how to prompt different LLMs, please refer to these resources:
- Hume EVI playground: Test out your system prompts in live conversations with EVI, and see how it responds differently when you change configuration options.
- OpenAI tokenizer: useful for counting the number of tokens in a system prompt for OpenAI models, which use the same tokenizer (tiktoken).
- OpenAI prompt engineering guidelines: for prompting OpenAI models like GPT-4.
- OpenAI playground: for testing OpenAI prompts in a chat interface.
- Anthropic prompt engineering guidelines: for prompting Anthropic models like Claude 3 Haiku
- Anthropic console: for testing Anthropic prompts in a chat interface.
- Fireworks model playground: for testing out open-source models served on Fireworks.
- Vercel AI playground: Try multiple prompts and LLMs in parallel to compare their responses.
- Perplexity Labs: Try different models, including open-source LLMs, to evaluate their responses and their latency.
- Prompt engineering guide: an open-source guide from DAIR.ai with general methods and advanced techniques for prompting a wide variety of LLMs.