
How to improve performance

On Windows, the plugin uses Vulkan for GPU acceleration, which significantly speeds up recognition. On other platforms, it runs on the CPU and relies on SIMD intrinsics for acceleration. You can improve performance further by following the recommendations below:

  1. Decrease Step Size

    By default, the step size is 5000 ms (5 seconds), meaning that captured audio is recognized every 5 seconds. To recognize audio more frequently, decrease the step size, for example to 500 ms (0.5 seconds).

  2. Use a Smaller Language Model

    Consider using a smaller language model, such as Tiny Quantized (Q5_1), to reduce the model size and improve performance. Instructions on how to select a language model can be found here.

  3. Alter CPU Instruction Sets

    The plugin is built on whisper.cpp, which uses CPU instruction sets (SIMD extensions) to speed up recognition. Because UE limits how compiler flags can be passed, the instruction sets are currently hard-coded in the plugin: they are defined from various macros, based on an estimate of how likely the target platform is to support them. You can manually modify the SpeechRecognizerPrivate.h file to define the instruction sets supported by your target platform. The following instruction sets are currently used by whisper.cpp and can be defined manually in SpeechRecognizerPrivate.h:

    • AVX, AVX2, and AVX-512 Family:

      • __AVX__
      • __AVXVNNI__
      • __AVX2__
      • __AVX512F__
      • __AVX512VBMI__
      • __AVX512VNNI__
      • __AVX512BF16__
    • Floating-Point and SIMD Extensions:

      • __FMA__
      • __F16C__
      • __SSE3__
      • __SSSE3__
    • ARM Architecture Extensions:

      • __ARM_NEON
      • __ARM_FEATURE_SVE
      • __ARM_FEATURE_FMA
      • __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
      • __ARM_FEATURE_MATMUL_INT8
    • POWER Architecture Extensions:

      • __POWER9_VECTOR__
  4. Use Acceleration Libraries

    whisper.cpp can accelerate recognition by using the following libraries: Core ML on Apple Silicon devices, OpenVINO on devices with x86 CPUs and Intel GPUs, NVIDIA CUDA on Windows or Linux, and BLAS on the CPU via OpenBLAS or Intel MKL. Please note that these libraries are not included in the plugin by default; you need to install them manually by following the whisper.cpp instructions.