How to use the plugin

The Runtime Text To Speech plugin synthesizes text into speech using downloadable voice models. These models are downloaded and managed in the plugin settings within the editor, then packaged with your project for runtime use. Follow the steps below to get started.

Editor side

Download the appropriate voice models for your project as described here. You can download multiple voice models at the same time.

Runtime side

Create the synthesizer using the CreateRuntimeTextToSpeech function. Make sure you keep a reference to it (e.g. as a separate variable in Blueprints or a UPROPERTY in C++) to prevent it from being garbage collected.

An example of creating a Runtime Text To Speech synthesizer in Blueprints
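
In C++, the pattern might look like the sketch below. The class name URuntimeTTS, the header RuntimeTTS.h, and the static CreateRuntimeTextToSpeech call on it are assumptions based on the function name mentioned above; verify them against the plugin headers.

```cpp
// Minimal sketch, assuming the synthesizer class is named URuntimeTTS and is
// declared in "RuntimeTTS.h" - check the plugin source for the actual names.
#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "RuntimeTTS.h"
#include "MyTTSActor.generated.h"

UCLASS()
class AMyTTSActor : public AActor
{
    GENERATED_BODY()

public:
    // Holding the synthesizer in a UPROPERTY keeps it from being garbage collected
    UPROPERTY()
    URuntimeTTS* Synthesizer = nullptr;

    virtual void BeginPlay() override
    {
        Super::BeginPlay();

        // Create the synthesizer once and reuse it for all requests
        Synthesizer = URuntimeTTS::CreateRuntimeTextToSpeech();
    }
};
```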

Once the synthesizer is created, you can call either of the following functions to synthesize text:

  • Text To Speech (By Name) (TextToSpeechByName in C++)
  • Text To Speech (By Object) (TextToSpeechByObject in C++)

By Name

The Text To Speech (By Name) function is the more convenient option in Blueprints starting from UE 5.4: it lets you select a voice model from a dropdown list of the downloaded models. In UE 5.3 and earlier this dropdown doesn't appear, so if you're using an older version, you'll need to manually iterate over the array of voice models returned by GetDownloadedVoiceModels to select the one you need.

An example of using Text To Speech by Name in Blueprints
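
A C++ call might look roughly like the following. The delegate type name, the parameter order, and the example voice model name are assumptions; consult the plugin headers for the exact TextToSpeechByName signature.

```cpp
// Sketch only: the delegate type, parameter order, and voice model name below
// are assumptions, not the verified signature.
void AMyTTSActor::SpeakByName(const FString& Text)
{
    // On UE 5.3 and earlier there is no dropdown, so you could instead pick a
    // model from the array returned by GetDownloadedVoiceModels (see above).

    FOnTTSResult OnSpeechResult; // hypothetical delegate type name
    OnSpeechResult.BindUFunction(this, FName("HandleSpeechResult"));

    Synthesizer->TextToSpeechByName(TEXT("en_US-amy-medium"), Text, OnSpeechResult);
}
```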

By Object

The Text To Speech (By Object) function works across all versions of Unreal Engine but presents the voice models as a dropdown list of asset references, which is less intuitive. This method is ideal for UE 5.3 and earlier, or if your project requires a direct reference to a voice model asset for any reason.

An example of using Text To Speech by Object in Blueprints
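
The C++ equivalent takes a direct asset reference instead of a name; a rough sketch follows, where the voice model asset type name is an assumption.

```cpp
// Sketch only: UTTSVoiceModel is an assumed name for the voice model asset type.
void AMyTTSActor::SpeakByObject(UTTSVoiceModel* VoiceModel, const FString& Text)
{
    FOnTTSResult OnSpeechResult; // hypothetical delegate type name
    OnSpeechResult.BindUFunction(this, FName("HandleSpeechResult"));

    // Pass a direct reference to the voice model asset rather than its name
    Synthesizer->TextToSpeechByObject(VoiceModel, Text, OnSpeechResult);
}
```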

If you've downloaded the models but can't see them, open the Voice Model dropdown, click the settings (gear) icon, and enable both Show Plugin Content and Show Engine Content to make the models visible.

Playback

The On Speech Result delegate provides the synthesized audio as PCM data in float format (as a byte array in Blueprints or TArray<uint8> in C++), along with the Sample Rate and Num Of Channels. You can process this data as needed.
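
If you want to inspect the samples yourself, the byte array can be reinterpreted as 32-bit floats (the standard layout for float PCM). The handler signature below is an assumption about how the delegate's parameters map to C++.

```cpp
// Sketch: the handler signature is an assumption; the buffer is treated as
// interleaved 32-bit float PCM, matching the float format described above.
void AMyTTSActor::HandleSpeechResult(const TArray<uint8>& AudioData, int32 SampleRate, int32 NumOfChannels)
{
    const float* Samples = reinterpret_cast<const float*>(AudioData.GetData());
    const int32 NumSamples = AudioData.Num() / sizeof(float);

    // Duration = total samples / (sample rate * channel count)
    const float DurationSeconds = static_cast<float>(NumSamples) / (SampleRate * NumOfChannels);
    UE_LOG(LogTemp, Log, TEXT("Synthesized %d samples (%.2f s) of audio"), NumSamples, DurationSeconds);
}
```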

For playback, it's recommended to use the Runtime Audio Importer plugin to convert raw audio data into a playable sound wave. However, this is optional, and you can handle raw PCM audio data with your own solution if desired.

To use Runtime Audio Importer, call Import Audio From RAW Buffer with the RAW Buffer (from the On Speech Result delegate), Sample Rate, and Num Of Channels to generate a sound wave.

Once you have the sound wave, you can use it like any regular sound wave in Unreal Engine - playing it back, saving it to a save game file, or further processing it.
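
In C++, the import-and-play flow might look like the sketch below. The class and function names follow the Runtime Audio Importer plugin, but treat the exact delegate signature and enum values as assumptions to verify against your plugin version.

```cpp
// Sketch, assuming the Runtime Audio Importer API: CreateRuntimeAudioImporter,
// ImportAudioFromRAWBuffer, and an OnResultNative delegate - verify against your version.
#include "RuntimeAudioImporterLibrary.h"
#include "Kismet/GameplayStatics.h"

void AMyTTSActor::PlayRawAudio(const TArray<uint8>& RAWBuffer, int32 SampleRate, int32 NumOfChannels)
{
    // In real code, store the importer in a UPROPERTY so it isn't garbage collected mid-import
    URuntimeAudioImporterLibrary* Importer = URuntimeAudioImporterLibrary::CreateRuntimeAudioImporter();

    Importer->OnResultNative.AddWeakLambda(this,
        [this](URuntimeAudioImporterLibrary*, UImportedSoundWave* SoundWave, ERuntimeImportStatus Status)
        {
            if (Status == ERuntimeImportStatus::SuccessfulImport)
            {
                // The imported sound wave behaves like any regular sound wave
                UGameplayStatics::PlaySound2D(GetWorld(), SoundWave);
            }
        });

    // Float32 matches the PCM float data provided by On Speech Result
    Importer->ImportAudioFromRAWBuffer(RAWBuffer, ERuntimeRAWAudioFormat::Float32, SampleRate, NumOfChannels);
}
```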

Here's an example of how the Blueprint nodes for synthesizing text and playing the audio might look (Copyable nodes):

The Runtime Audio Importer plugin also provides additional features like exporting audio data to a file, passing it to SoundCue, MetaSound, and more. For further details, check out the Runtime Audio Importer documentation.