Realtime speech synthesis

In traditional text-to-speech synthesis, you submit a fixed text and select a voice from a predefined list. In addition to this, some providers offer realtime speech synthesis.

In realtime synthesis, speech is generated in a streaming manner while the source text is still being updated. This is particularly useful when the text comes from a source such as a Large Language Model (LLM) like ChatGPT, which returns its output in chunks.
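To illustrate how chunked text can be fed to a streaming synthesizer, here is a minimal sketch in plain JavaScript that buffers incoming chunks and forwards whole sentences. The createSentenceBuffer helper and the sent array are hypothetical names used only for illustration. They are not part of VoxEngine or of any provider API. In a real scenario, the callback would hand each sentence to the realtime TTS player instead of collecting it in an array.

```javascript
// Hypothetical helper (not a VoxEngine API): buffers streamed text chunks
// and calls onSentence once for every complete sentence.
function createSentenceBuffer(onSentence) {
  let pending = "";
  return {
    // Append a chunk and emit every complete sentence found so far.
    push(chunk) {
      pending += chunk;
      // Treat ., ! or ? followed by whitespace as a sentence boundary.
      const boundary = /[.!?]+\s+/g;
      let start = 0;
      let match;
      while ((match = boundary.exec(pending)) !== null) {
        const end = match.index + match[0].length;
        onSentence(pending.slice(start, end).trim());
        start = end;
      }
      // Keep the unfinished tail for the next chunk.
      pending = pending.slice(start);
    },
    // Emit whatever is left once the source signals the end of the text.
    flush() {
      const rest = pending.trim();
      pending = "";
      if (rest) onSentence(rest);
    },
  };
}

// Usage: `sent` stands in for text passed on to a realtime TTS player.
const sent = [];
const buffer = createSentenceBuffer((sentence) => sent.push(sentence));
for (const chunk of ["Hel", "lo there. How ", "are you", "? Fine"]) {
  buffer.push(chunk);
}
buffer.flush();
console.log(sent);
```

Splitting on sentence boundaries is only one possible policy. Forwarding every chunk immediately lowers latency further, while sentence-sized pieces give the synthesizer more context for intonation.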

In this article, we show how to use realtime speech synthesis with the ElevenLabs and Cartesia providers, which offer a range of realtime voice options.

ElevenLabs example

To use ElevenLabs realtime speech synthesis, require the Modules.ElevenLabs module from VoxEngine in your scenario.

Use the ElevenLabs.createRealtimeTTSPlayer method to create a realtime TTS player and pass the desired parameters. Use the *.sendMedia or VoxEngine.sendMediaBetween methods to send media between the Call and the ElevenLabs.RealtimeTTSPlayer.

Listen to the ElevenLabs.RealtimeTTSPlayer events and implement your application's business logic in their handlers.

Here is the complete scenario for your reference.

ElevenLabs Realtime TTS

Custom API keys

If you have your own API key for the ElevenLabs provider, you can specify it in your scenario. Using your own key can help you avoid certain restrictions and personalize your application.

For the ElevenLabs provider, specify the custom API key in the headers parameter. Refer to the scenario below:

Custom API key for ElevenLabs

Cartesia example

Using the Cartesia provider is very similar. First, require the Modules.Cartesia module from VoxEngine in your scenario.

Use the Cartesia.createRealtimeTTSPlayer method to create a realtime TTS player and pass the desired parameters.

Listen to the Cartesia.RealtimeTTSPlayer events and implement your application's business logic in their handlers.

Here is the scenario example for the Cartesia provider:

Cartesia Realtime TTS

Custom API keys

If you have your own API key for the Cartesia provider, you can specify it in your scenario. Using your own key can help you avoid certain restrictions and personalize your application.

For the Cartesia provider, specify the custom API key in the cartesiaRealtimeTTSPlayerParameters object. Refer to the scenario below:

Custom API key for Cartesia