Text-to-speech (JSON)
Synthesizes one or more input texts into speech using the specified voice. If no voice is provided, a novel voice will be generated dynamically. Optionally, additional context can be included to influence the speech’s style and prosody.
The response includes the base64-encoded audio and metadata in JSON format.
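As a minimal sketch of handling such a response, the Python below decodes base64 audio out of a JSON body. The field names `generations`, `audio`, and `request_id`, and the helper `extract_audio`, are assumptions made for illustration and are not confirmed by this page; check the response schema for the actual names. The sample body is hand-built, not real API output.

```python
import base64
import json


def extract_audio(response_body: str, audio_field: str = "audio") -> list[bytes]:
    """Decode each base64 audio string in a JSON response body.

    Assumes (hypothetically) that the body holds a `generations` list
    whose items carry the audio under `audio_field`.
    """
    payload = json.loads(response_body)
    return [
        base64.b64decode(generation[audio_field])
        for generation in payload.get("generations", [])
    ]


# Hand-built stand-in for a response body (hypothetical shape).
sample_body = json.dumps(
    {
        "request_id": "example-id",
        "generations": [
            {"audio": base64.b64encode(b"fake-audio-bytes").decode("ascii")}
        ],
    }
)

clips = extract_audio(sample_body)
```

Each element of `clips` is raw audio bytes that could then be written to a file in binary mode.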
Headers
Request
If enabled, the audio for all chunks of a generation, once concatenated, constitutes a single audio file. If disabled, each chunk’s audio is its own audio file, with its own headers (if applicable).
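The effect of this option on client code can be sketched as follows. The helper `assemble_chunks` and its `single_file` flag are hypothetical names for illustration; the flag stands in for whether the option described above was enabled on the request. The sketch assumes the chunks arrive in order as base64 strings.

```python
import base64


def assemble_chunks(chunk_audio_b64: list[str], single_file: bool) -> list[bytes]:
    """Turn base64 audio chunks into one or more audio files' worth of bytes.

    single_file=True: the chunks are meant to be concatenated, so return
    one joined byte string.
    single_file=False: each chunk is a complete file on its own, so return
    them separately.
    """
    decoded = [base64.b64decode(chunk) for chunk in chunk_audio_b64]
    if single_file:
        return [b"".join(decoded)]
    return decoded


# Placeholder chunks, not real audio.
chunks = [base64.b64encode(part).decode("ascii") for part in (b"abc", b"def")]

joined = assemble_chunks(chunks, single_file=True)
separate = assemble_chunks(chunks, single_file=False)
```

Joining bytes directly is only appropriate when the option is enabled; chunks that each carry their own headers would not form a valid single file if simply concatenated.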
Response
A unique ID associated with this request, used for tracking and troubleshooting. Include this ID when contacting support.