
How to use the plugin

The Runtime Text To Speech plugin synthesizes text into speech using downloadable voice models. These models are managed in the plugin settings within the editor, downloaded, and packaged for runtime use. Follow the steps below to get started.

Editor side

Download the appropriate voice models for your project as described here. You can download multiple voice models at the same time.

Runtime side

Create the synthesizer using the CreateRuntimeTextToSpeech function. Ensure you maintain a reference to it (e.g. as a separate variable in Blueprints or UPROPERTY in C++) to prevent it from being garbage collected. Once the synthesizer is created, you can call either of the following functions to synthesize text:

  • Text To Speech (By Object) (TextToSpeechByObject in C++)
  • Text To Speech (By Name) (TextToSpeechByName in C++)

By Name

The Text To Speech (By Name) (TextToSpeechByName in C++) function is more convenient in Blueprints starting from UE 5.4. It lets you select a voice model from a dropdown list of the downloaded models. In UE 5.3 and earlier, this dropdown doesn't appear, so on those versions you'll need to manually iterate over the array of voice models returned by GetDownloadedVoiceModels to select the one you need.

By Object

The Text To Speech (By Object) (TextToSpeechByObject in C++) function works across all versions of Unreal Engine but presents the voice models as a dropdown list of asset references, which is less intuitive. This method is ideal for UE 5.3 and earlier, or if your project requires a direct reference to a voice model asset for any reason.

If you've downloaded the models but can't see them, open the Voice Model dropdown, click the settings gear icon, and enable both Show Plugin Content and Show Engine Content to make the models visible.

Playback

The On Speech Result delegate provides the synthesized audio as PCM data in float format (as a byte array in Blueprints or TArray<uint8> in C++), along with the SampleRate and NumOfChannels. You can process this data as needed.
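If you want to process the data yourself, the byte array first has to be reinterpreted as float samples. The following is a minimal standalone sketch of that step, not part of the plugin's API. It assumes the samples are 32-bit floats in native byte order with channels interleaved. The helper names DecodeFloatPcm and DurationSeconds are hypothetical, and std::vector stands in for TArray so the snippet compiles outside Unreal Engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reinterprets a raw byte buffer as float PCM samples.
// Assumption: 32-bit floats, native byte order, interleaved channels.
// Trailing bytes that do not form a complete sample are ignored.
std::vector<float> DecodeFloatPcm(const std::vector<std::uint8_t>& bytes)
{
    static_assert(sizeof(float) == 4, "this sketch assumes 32-bit floats");
    const std::size_t count = bytes.size() / sizeof(float);
    std::vector<float> samples(count);
    if (count > 0)
    {
        // memcpy avoids the alignment and aliasing issues of a pointer cast.
        std::memcpy(samples.data(), bytes.data(), count * sizeof(float));
    }
    return samples;
}

// Hypothetical helper: duration in seconds of an interleaved sample buffer.
double DurationSeconds(std::size_t sampleCount, int sampleRate, int numChannels)
{
    assert(sampleRate > 0 && numChannels > 0);
    // One frame holds one sample per channel.
    const std::size_t frames = sampleCount / static_cast<std::size_t>(numChannels);
    return static_cast<double>(frames) / sampleRate;
}
```

In an Unreal project, the same idea would apply to the TArray<uint8> received from the delegate, using its GetData() and Num() accessors in place of data() and size().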

For playback, it's recommended to use the Runtime Audio Importer plugin, which converts raw audio data into a playable sound wave. To do this, use the Import Audio From RAW Buffer function and pass the RAW Buffer (from the On Speech Result delegate) along with the SampleRate and NumOfChannels. This will generate a sound wave from the synthesized audio.

Once you have the sound wave, you can use it like any regular sound wave in Unreal Engine - playing it back, saving it to a save game file, or further processing it.

Here's an example of how the Blueprint nodes for synthesizing text and playing the audio might look (Copyable nodes):

The Runtime Audio Importer plugin also provides additional features like exporting audio data to a file, passing it to SoundCue, MetaSound, and more. For further details, check out the Runtime Audio Importer documentation.