The description field for acting instructions is available for Octave 1 only. Support for description with Octave 2 is coming soon. The speed and trailing_silence fields are supported in all models.
Octave accepts acting instructions that guide aspects of speech delivery.
In the following section, we’ll explore the ways in which you can provide acting instructions to Octave through the API.
The TTS API offers parameters which allow you to control how an individual utterance is performed. These parameters can be used individually or combined for precise control over speech output:
speed: a value from 0.5 (much slower) to 2.0 (much faster), where 1.0 represents normal speaking pace. Note that changes are not proportional to the value provided. For example, setting speed to 2.0 will make speech faster but not exactly twice as fast as the default.

In this section we’ll leverage acting instructions to guide Octave’s speech output for guided meditation.
Before we apply acting instructions, let’s first take a look at a request that does not contain any acting instructions:
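As a rough sketch, a request body without acting instructions might look like the following. The utterances, text, and voice keys, the placeholder voice name, and the sample sentences are assumptions for illustration; check them against the TTS API reference before use.

```python
import json

# Placeholder voice reference (hypothetical name; substitute your own voice).
VOICE = {"name": "YOUR_VOICE_NAME"}

# Assumed request shape: a list of utterances, each with text and a voice.
# No description, speed, or trailing_silence keys are set.
baseline_request = {
    "utterances": [
        {"text": "Let us begin by taking a deep breath in.", "voice": VOICE},
        {"text": "Now, slowly exhale.", "voice": VOICE},
    ]
}

# Serialize the body as it would be sent in a JSON request.
print(json.dumps(baseline_request, indent=2))
```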
Without acting instructions, Octave will infer how to deliver the speech from the base voice’s description and the provided text input.
In the following steps, we’ll iteratively improve Octave’s delivery by specifying different types of acting instructions to better simulate guided meditation.
Let’s begin by providing a description to guide the delivery of these utterances to be calmer and more instructive:
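A minimal sketch of this step, assuming the same request shape as before (the utterances, text, and voice keys and the voice name are illustrative assumptions, as is the exact wording of the description):

```python
import json

# Placeholder voice reference (hypothetical name; substitute your own voice).
VOICE = {"name": "YOUR_VOICE_NAME"}

# Illustrative wording only; any natural-language description could be used.
DESCRIPTION = "calm and instructive"

request_with_description = {
    "utterances": [
        {
            "text": "Let us begin by taking a deep breath in.",
            "description": DESCRIPTION,
            "voice": VOICE,
        },
        {
            "text": "Now, slowly exhale.",
            "description": DESCRIPTION,
            "voice": VOICE,
        },
    ]
}

print(json.dumps(request_with_description, indent=2))
```

Because a voice is specified on each utterance, the description here acts as an acting instruction rather than as a prompt for a new voice.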
When you don’t specify a voice, the description field serves as a voice prompt for creating a new voice. See our Voice Design guide for details.
While the descriptions help to make the voice sound more appropriate for our use case, we now want to adjust the speed of delivery to be slower to create an atmosphere better suited for meditation:
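One way this could look, as a sketch: the speed value of 0.65 is an illustrative choice inside the documented 0.5 to 2.0 range, and the surrounding request shape and voice name are assumptions carried over for illustration.

```python
import json

# Placeholder voice reference (hypothetical name; substitute your own voice).
VOICE = {"name": "YOUR_VOICE_NAME"}

# Illustrative value; below 1.0 slows delivery, above 1.0 speeds it up.
SLOW_SPEED = 0.65

request_with_speed = {
    "utterances": [
        {
            "text": "Let us begin by taking a deep breath in.",
            "description": "calm and instructive",
            "speed": SLOW_SPEED,
            "voice": VOICE,
        },
        {
            "text": "Now, slowly exhale.",
            "description": "calm and instructive",
            "speed": SLOW_SPEED,
            "voice": VOICE,
        },
    ]
}

print(json.dumps(request_with_speed, indent=2))
```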
Finally, in this guided meditation, it would be helpful to give the participants some time to actually take a breath! To achieve this we can introduce a pause between utterances by specifying a trailing silence duration for the first utterance.
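A sketch of the pause between utterances, under the same assumed request shape. The trailing_silence value of 4 is illustrative, and the unit is assumed here to be seconds; confirm the unit and allowed range in the API reference.

```python
import json

# Placeholder voice reference (hypothetical name; substitute your own voice).
VOICE = {"name": "YOUR_VOICE_NAME"}

request_with_pause = {
    "utterances": [
        {
            "text": "Let us begin by taking a deep breath in.",
            "description": "calm and instructive",
            "speed": 0.65,
            # Silence appended after this utterance (assumed unit: seconds).
            "trailing_silence": 4,
            "voice": VOICE,
        },
        {
            # No trailing_silence here, so nothing extra follows the last line.
            "text": "Now, slowly exhale.",
            "description": "calm and instructive",
            "speed": 0.65,
            "voice": VOICE,
        },
    ]
}

print(json.dumps(request_with_pause, indent=2))
```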
To inject natural breaks within an utterance, try using [pause] or [long pause] in your text. Example:
“Haha [pause] I didn’t realize this was going to be a formal event.”
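If utterance text is assembled programmatically, the markers can be inserted with a small helper. The with_pauses function below is a hypothetical convenience, not part of any SDK; it only joins phrases using one of the two markers named above.

```python
ALLOWED_MARKERS = ("[pause]", "[long pause]")


def with_pauses(phrases, marker="[pause]"):
    """Join phrases with an inline pause marker between each pair."""
    if marker not in ALLOWED_MARKERS:
        raise ValueError(f"marker must be one of {ALLOWED_MARKERS}")
    # Strip stray whitespace so exactly one space surrounds each marker.
    return f" {marker} ".join(phrase.strip() for phrase in phrases)


text = with_pauses(["Haha", "I didn't realize this was going to be a formal event."])
print(text)
```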
Combine natural language descriptions, speed adjustments, and pauses to control Octave’s delivery. In the meditation example, these settings turn a simple line into naturally paced speech. Tune these controls together to match your intended delivery.
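To tune these controls together, it can help to build utterances through one function that omits unset fields and rejects out-of-range speeds. The build_utterance helper below is a hypothetical sketch; only the 0.5 to 2.0 speed range comes from this guide, and the text key is an assumed field name.

```python
from typing import Any, Dict, Optional

# Speed range stated in this guide.
SPEED_MIN, SPEED_MAX = 0.5, 2.0


def build_utterance(
    text: str,
    description: Optional[str] = None,
    speed: Optional[float] = None,
    trailing_silence: Optional[float] = None,
) -> Dict[str, Any]:
    """Build one utterance dict, including only the controls that were set."""
    utterance: Dict[str, Any] = {"text": text}
    if description is not None:
        utterance["description"] = description
    if speed is not None:
        if not SPEED_MIN <= speed <= SPEED_MAX:
            raise ValueError(f"speed must be between {SPEED_MIN} and {SPEED_MAX}")
        utterance["speed"] = speed
    if trailing_silence is not None:
        if trailing_silence < 0:
            raise ValueError("trailing_silence cannot be negative")
        utterance["trailing_silence"] = trailing_silence
    return utterance


meditation_line = build_utterance(
    "Let us begin by taking a deep breath in.",
    description="calm and instructive",
    speed=0.65,
    trailing_silence=4,
)
print(meditation_line)
```

Leaving a control unset keeps it out of the request entirely, so Octave falls back to inferring that aspect of delivery from the voice and the text.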
Use speed for adjusting speech rate: rather than using the description field to instruct slower or faster speech, use the speed parameter.

The table below demonstrates how acting instructions can transform the same text into different delivery styles: