Minimizing freezes

This guide addresses two common sources of freezing in the RuntimeSpeechRecognizer plugin and provides practical solutions to mitigate the performance impact.

Capturable sound wave

When you start the capturable sound wave with the StartCapture function, you may notice a brief hitch. It comes from the engine's platform-specific code for retrieving audio data from the input device (microphone), so it currently cannot be avoided without modifying engine code. The duration of the hitch varies across platforms. It has been observed on Windows, Mac, Android, and iOS, and may occur on other platforms as well.

To minimize this hitch, call StartCapture at a time when its impact is negligible, such as during a loading screen. Immediately after calling StartCapture, call ToggleMute with the Mute parameter set to True. When you are ready to start capturing audio data, activate the capturable sound wave by calling ToggleMute with the Mute parameter set to False. You can see more relevant info here.
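The ordering described above can be sketched in plain C++. This is only an illustration and not the plugin's code: MockCapturableSoundWave, OnAudioFromDevice, WarmUpCapture, and BeginRecording are hypothetical stand-ins, and the sketch assumes that audio arriving while the sound wave is muted is simply discarded. Only the names StartCapture and ToggleMute come from the plugin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the capturable sound wave. It models only the
// behavior described above: StartCapture is the expensive call, ToggleMute is
// cheap and decides whether incoming audio is kept (an assumption).
class MockCapturableSoundWave {
public:
    // Stand-in for StartCapture: in the engine, the hitch happens here.
    void StartCapture() { capturing_ = true; }

    // Stand-in for ToggleMute: cheap enough to call during gameplay.
    void ToggleMute(bool mute) { muted_ = mute; }

    // Simulates the input device delivering a chunk of samples.
    void OnAudioFromDevice(const std::vector<float>& samples) {
        if (!capturing_ || muted_) return;  // muted audio is dropped
        captured_.insert(captured_.end(), samples.begin(), samples.end());
    }

    bool IsCapturing() const { return capturing_; }
    std::size_t CapturedSampleCount() const { return captured_.size(); }

private:
    bool capturing_ = false;
    bool muted_ = false;
    std::vector<float> captured_;
};

// Run while a loading screen is visible: pay the StartCapture cost early,
// then mute right away so nothing is recorded yet.
void WarmUpCapture(MockCapturableSoundWave& wave) {
    wave.StartCapture();
    wave.ToggleMute(true);
}

// Run when audio is actually needed: unmuting does not call StartCapture again.
void BeginRecording(MockCapturableSoundWave& wave) {
    assert(wave.IsCapturing() && "WarmUpCapture must run first");
    wave.ToggleMute(false);
}
```

In a real project, the two helper functions would map to whatever code runs during the loading screen and at the moment recording should begin, with the calls made on the actual capturable sound wave object.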

Start speech recognition

When you call StartSpeechRecognition, a slight freeze may occur while the engine loads the language model asset. Although this loading is designed to be asynchronous and to run on a separate thread, the engine still performs certain operations internally on the game thread, which causes a noticeable hitch, especially with large assets such as language models.

To keep this freeze from being noticeable, apply the same principle as described above: call StartSpeechRecognition at a time when the lag is acceptable or negligible, such as during a loading screen.
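A minimal sketch of this timing, again in plain C++ with hypothetical stand-ins: MockSpeechRecognizer, GamePhase, and RecognitionSession are invented for illustration, and only the name StartSpeechRecognition comes from the plugin. The sketch records which phase absorbed the expensive call so the ordering is easy to verify.

```cpp
#include <cassert>

// Hypothetical game phases used only for this illustration.
enum class GamePhase { LoadingScreen, Gameplay };

// Hypothetical stand-in for the speech recognizer. In the engine, the
// language model load (and the resulting hitch) happens inside this call.
class MockSpeechRecognizer {
public:
    void StartSpeechRecognition() { started_ = true; }
    bool IsStarted() const { return started_; }

private:
    bool started_ = false;
};

// Hypothetical session object that ties the call to a game phase.
struct RecognitionSession {
    MockSpeechRecognizer recognizer;
    GamePhase phase = GamePhase::LoadingScreen;
    // Phase in which the expensive call was made; overwritten in RunLoadingScreen.
    GamePhase hitchPhase = GamePhase::Gameplay;

    // Make the expensive call while the loading screen hides the hitch.
    void RunLoadingScreen() {
        phase = GamePhase::LoadingScreen;
        recognizer.StartSpeechRecognition();
        hitchPhase = phase;
    }

    // Gameplay starts only after recognition is already running.
    void EnterGameplay() {
        assert(recognizer.IsStarted() && "start recognition before gameplay");
        phase = GamePhase::Gameplay;
    }
};
```

The point of the design is that the gameplay phase never triggers the model load itself; it only checks that the work was already done earlier.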