How to use the plugin
This guide walks you through the process of setting up Runtime MetaHuman Lip Sync for your MetaHuman characters.
Note: Runtime MetaHuman Lip Sync works with both MetaHuman and custom characters. The plugin supports various character types including:
- Popular commercial characters (Daz Genesis 8/9, Reallusion CC3/CC4, Mixamo, ReadyPlayerMe, etc.)
- Characters with FACS-based blendshapes
- Models using ARKit blendshape standards
- Characters with Preston Blair phoneme sets
- 3ds Max phoneme systems
- Any character with custom morph targets for facial expressions
For detailed instructions on setting up custom characters, including viseme mapping references for all the above standards, see the Custom character setup guide.
Prerequisites
Before getting started, ensure:
- The MetaHuman plugin is enabled in your project (Note: Starting from UE 5.6, this step is no longer required as MetaHuman functionality is integrated directly into the engine)
- You have at least one MetaHuman character downloaded and available in your project
- The Runtime MetaHuman Lip Sync plugin is installed
Standard Model Extension Plugin
If you plan to use the Standard (Faster) Model, you'll need to install the extension plugin:
- Download the Standard Lip Sync Extension plugin from Google Drive
- Extract the folder from the downloaded archive into the `Plugins` folder of your project (create this folder if it doesn't exist)
- Ensure your project is set up as a C++ project (even if you don't have any C++ code)
- Rebuild your project
- This extension is only required if you want to use the Standard Model. If you only need the Realistic Model, you can skip this step.
- For more information on how to build plugins manually, see the Building Plugins tutorial
Additional Plugins
- If you plan to use audio capture (e.g., microphone input), install the Runtime Audio Importer plugin.
- If you plan to use text-to-speech functionality with my plugins (you may instead have your own custom TTS or another audio input), then in addition to the Runtime Audio Importer plugin, also install:
- For local TTS, the Runtime Text To Speech plugin.
- For external TTS providers (ElevenLabs, OpenAI), the Runtime AI Chatbot Integrator plugin.
Platform-Specific Configuration
Android / Meta Quest Configuration
If you're targeting Android or Meta Quest platforms and encounter build errors with this plugin, you'll need to disable the x86_64 (x64) Android architecture in your project settings:
- Go to Edit > Project Settings
- Navigate to Platforms > Android
- Under Platforms - Android > Build, find Support x86_64 [aka x64] and ensure it's disabled
This is because the plugin currently only supports arm64-v8a and armeabi-v7a architectures for Android / Meta Quest platforms.
Setup Process
Step 1: Locate and modify the face animation Blueprint
- UE 5.5 and Earlier (or Legacy MetaHumans in UE 5.6+)
- UE 5.6+ MetaHuman Creator Characters
You need to modify an Animation Blueprint that will be used for your MetaHuman character's facial animations. The default MetaHuman face Animation Blueprint is located at:
Content/MetaHumans/Common/Face/Face_AnimBP
You have several options for implementing the lip sync functionality:
- Edit Default Asset (Simplest Option)
- Create Duplicate
- Use Custom Animation Blueprint
Open the default `Face_AnimBP` directly and make your modifications. Any changes will affect all MetaHuman characters using this Animation Blueprint.
Note: This approach is convenient but will impact all characters using the default Animation Blueprint.
- Duplicate `Face_AnimBP` and give it a descriptive name
- Locate your character's Blueprint class (e.g., for character "Bryan", it would be at `Content/MetaHumans/Bryan/BP_Bryan`)
- Open the character Blueprint and find the Face component
- Change the Anim Class property to your newly duplicated Animation Blueprint
Note: This approach allows you to customize lip sync for specific characters while leaving others unchanged.
You can implement the lip sync blending in any Animation Blueprint that has access to the required facial bones:
- Create or use an existing custom Animation Blueprint
- Ensure your Animation Blueprint works with a skeleton that contains the same facial bones as the default MetaHuman's `Face_Archetype_Skeleton` (the standard skeleton used for any MetaHuman character)
Note: This approach gives you maximum flexibility for integration with custom animation systems.
Starting with UE 5.6, the new MetaHuman Creator system creates characters without the traditional `Face_AnimBP` asset. For these characters, the plugin provides a face Animation Blueprint located at:
Content/LipSyncData/LipSync_Face_AnimBP
This Animation Blueprint is located in the plugin's content folder and will be overwritten with each plugin update. To prevent losing your customizations, it's highly recommended to:
- Copy this asset to your project's Content folder (for example, to `YourProject/Content/MetaHumans/LipSync_Face_AnimBP`)
- Use your copied version in your character setup
- Make all your modifications to the copied version
This ensures your lip sync configurations will persist through plugin updates.
Using the Plugin's Face Animation Blueprint:
- Locate your MetaHuman Creator character's Blueprint class
- Open the character Blueprint and find the Face component
- Change the Anim Class property to the plugin's `LipSync_Face_AnimBP`
- Continue with Steps 2-4 to configure the Runtime MetaHuman Lip Sync functionality
Alternative Options:
- Use Legacy Instructions: You can still follow the UE 5.5 instructions above if you're working with legacy MetaHumans or prefer the traditional workflow
- Create Custom Animation Blueprint: Create your own Animation Blueprint that works with the MetaHuman Creator skeleton structure
Note: If you're using UE 5.6+ but working with legacy MetaHumans (not created through MetaHuman Creator), use the "UE 5.5 and Earlier" tab instructions instead.
Important: The Runtime MetaHuman Lip Sync blending can be implemented in any Animation Blueprint asset that has access to a pose containing the facial bones present in the default MetaHuman's `Face_Archetype_Skeleton`. You're not limited to the options above - these are just common implementation approaches.
Step 2: Event Graph setup
Open your Face Animation Blueprint and switch to the `Event Graph`. You'll need to create a generator that will process audio data and generate lip sync animation.
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Add the `Event Blueprint Begin Play` node if it doesn't exist already
- Add the `Create Runtime Viseme Generator` node and connect it to the Begin Play event
- Save the output as a variable (e.g., "VisemeGenerator") for use in other parts of the graph
- Add the `Event Blueprint Begin Play` node if it doesn't exist already
- Add the `Create Realistic MetaHuman Lip Sync Generator` node and connect it to the Begin Play event
- Save the output as a variable (e.g., "RealisticLipSyncGenerator") for use in other parts of the graph
- (Optional) Configure the generator settings using the Configuration parameter
- (Optional) Set the Processing Chunk Size on the Realistic MetaHuman Lip Sync Generator object
Note: The Realistic Model is optimized specifically for MetaHuman characters and is not compatible with custom character types.
Configuration Options
The `Create Realistic MetaHuman Lip Sync Generator` node accepts an optional Configuration parameter that allows you to customize the generator's behavior:
Model Type
The Model Type setting determines which version of the realistic model to use:
| Model Type | Performance | Visual Quality | Noise Handling | Recommended Use Cases |
|---|---|---|---|---|
| Highly Optimized (Default) | Highest performance, lowest CPU usage | Good quality | May show noticeable mouth movements with background noise or non-voice sounds | Clean audio environments, performance-critical scenarios |
| Optimized | Good performance, moderate CPU usage | High quality | Better stability with noisy audio | Balanced performance and quality, mixed audio conditions |
| Original Unoptimized | Suitable for real-time use on modern CPUs | Highest quality | Most stable with background noise and non-voice sounds | High-quality productions, noisy audio environments, when maximum accuracy is needed |
Performance Settings
Intra Op Threads: Controls the number of threads used for internal model processing operations.
- 0 (Default/Automatic): Uses automatic detection (typically 1/4 of available CPU cores, maximum 4)
- 1-16: Manually specify thread count. Higher values may improve performance on multi-core systems but use more CPU
Inter Op Threads: Controls the number of threads used for parallel execution of different model operations.
- 0 (Default/Automatic): Uses automatic detection (typically 1/8 of available CPU cores, maximum 2)
- 1-8: Manually specify thread count. Usually kept low for real-time processing
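For reference, here is a minimal standalone C++ sketch of the automatic-detection heuristic described above (roughly 1/4 of cores capped at 4 for Intra Op, 1/8 of cores capped at 2 for Inter Op). It mirrors the documented behavior, not the plugin's actual source:

```cpp
#include <algorithm>
#include <cstdio>
#include <thread>

int main()
{
    // Fall back to 1 if the core count cannot be detected.
    const unsigned Cores = std::max(1u, std::thread::hardware_concurrency());

    // Value 0 ("Automatic") resolves to roughly these counts per the description above.
    const unsigned IntraOpThreads = std::min(4u, std::max(1u, Cores / 4)); // ~1/4 of cores, max 4
    const unsigned InterOpThreads = std::min(2u, std::max(1u, Cores / 8)); // ~1/8 of cores, max 2

    std::printf("Cores: %u -> Intra Op Threads: %u, Inter Op Threads: %u\n",
                Cores, IntraOpThreads, InterOpThreads);
    return 0;
}
```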
Using Configuration
To configure the generator:
- In the `Create Realistic MetaHuman Lip Sync Generator` node, expand the Configuration parameter
- Set Model Type to your preferred option:
- Use Highly Optimized for best performance (recommended for most users)
- Use Optimized for balanced performance and quality
- Use Original Unoptimized only when maximum quality is essential
- Adjust Intra Op Threads and Inter Op Threads if needed (leave at 0 for automatic detection in most cases)
Performance Recommendations:
- For most projects with clean audio, use Highly Optimized for best performance
- If you're working with audio that contains background noise, music, or non-voice sounds, consider using Optimized or Original Unoptimized models for better stability
- The Highly Optimized model may show subtle mouth movements when processing non-voice audio due to optimization techniques applied during model creation
- The Original Unoptimized model, while requiring more CPU resources, is still suitable for real-time applications on modern hardware and provides the most accurate results with challenging audio conditions
- Only adjust thread counts if you're experiencing performance issues or have specific optimization requirements
- Higher thread counts don't always mean better performance - the optimal values depend on your specific hardware and project requirements
Processing Chunk Size Configuration: The Processing Chunk Size determines how many samples are processed in each inference step. The default value is 160 samples, which corresponds to 10ms of audio at 16kHz (the internal processing sample rate). You can adjust this value to balance between update frequency and CPU usage:
- Smaller values provide more frequent updates but increase CPU usage
- Larger values reduce CPU load but may decrease lip sync responsiveness
To set the Processing Chunk Size:
- Access your `Realistic MetaHuman Lip Sync Generator` object
- Locate the `Processing Chunk Size` property
- Set your desired value
It's recommended to use values that are multiples of 160, as this aligns with the model's internal processing structure. Recommended values include:
- `160` (default, minimum recommended)
- `320`
- `480`
- `640`
- etc.
The default Processing Chunk Size of `160` samples corresponds to 10 ms of audio at 16 kHz. Using multiples of 160 maintains alignment with this base unit, which can help optimize processing efficiency and maintain consistent behavior across different chunk sizes.
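As a quick reference, this standalone C++ sketch shows the chunk-size-to-update-interval arithmetic, assuming the 16 kHz internal processing rate stated above:

```cpp
#include <cstdio>

int main()
{
    // The Realistic generator processes audio at 16 kHz internally,
    // so a chunk of N samples covers N / 16000 seconds of audio.
    const int InternalSampleRate = 16000;
    const int ChunkSizes[] = {160, 320, 480, 640}; // recommended multiples of 160

    for (int Samples : ChunkSizes)
    {
        const double Milliseconds = 1000.0 * Samples / InternalSampleRate;
        std::printf("Processing Chunk Size %d -> one inference step every %.1f ms\n",
                    Samples, Milliseconds);
    }
    return 0;
}
```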
For reliable and consistent operation with the Realistic Model, it's required to recreate the Realistic MetaHuman Lip Sync Generator each time you want to feed new audio data after a period of inactivity. This is due to ONNX runtime behavior that can cause lip sync to stop working when reusing generators after periods of silence.
Example scenario: If you performed TTS lip sync and then stopped, and later want to perform lip sync again with new audio, create a new Realistic MetaHuman Lip Sync Generator instead of reusing the existing one.
Step 3: Set up audio input processing
You need to set up a method to process audio input. There are several ways to do this depending on your audio source.
- Microphone (Real-time)
- Microphone (Playback)
- Text-to-Speech (Local)
- Text-to-Speech (External APIs)
- From Audio File/Buffer
- Streaming Audio Buffer
This approach performs lip sync in real-time while speaking into the microphone:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Create a Capturable Sound Wave using Runtime Audio Importer
- Before starting to capture audio, bind to the `OnPopulateAudioData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
- Start capturing audio from the microphone
The Realistic Model uses the same audio processing workflow as the Standard Model, but with the `RealisticLipSyncGenerator` variable instead of `VisemeGenerator`. In each of the examples shown for the Standard Model, simply replace `VisemeGenerator` with your `RealisticLipSyncGenerator` variable - the function names and parameters remain identical between both models.
This approach captures audio from a microphone, then plays it back with lip sync:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Create a Capturable Sound Wave using Runtime Audio Importer
- Start audio capture from the microphone
- Before playing back the capturable sound wave, bind to its `OnGeneratePCMData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
The Realistic Model uses the same audio processing workflow as the Standard Model, but with the `RealisticLipSyncGenerator` variable instead of `VisemeGenerator`. In each of the examples shown for the Standard Model, simply replace `VisemeGenerator` with your `RealisticLipSyncGenerator` variable - the function names and parameters remain identical between both models.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
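For illustration, here is a small standalone C++ sketch of that divisor arithmetic, assuming a hypothetical 44.1 kHz capture sample rate:

```cpp
#include <cstdio>

int main()
{
    // Hypothetical capture sample rate; the divisor controls how often
    // captured audio is streamed to the lip sync generator.
    const int SampleRate = 44100;

    const int SamplesPerChunkDefault = SampleRate / 100; // ~10 ms of audio per chunk
    const int SamplesPerChunkFaster  = SampleRate / 150; // ~6.67 ms of audio per chunk

    std::printf("SampleRate / 100 = %d samples (~%.2f ms per update)\n",
                SamplesPerChunkDefault, 1000.0 * SamplesPerChunkDefault / SampleRate);
    std::printf("SampleRate / 150 = %d samples (~%.2f ms per update)\n",
                SamplesPerChunkFaster, 1000.0 * SamplesPerChunkFaster / SampleRate);
    return 0;
}
```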
- Regular
- Streaming
This approach synthesizes speech from text and performs lip sync:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Use Runtime Text To Speech to generate speech from text
- Use Runtime Audio Importer to import the synthesized audio
- Before playing back the imported sound wave, bind to its `OnGeneratePCMData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
The local TTS provided by Runtime Text To Speech plugin is not currently supported with the Realistic model due to ONNX runtime conflicts. For text-to-speech with the Realistic model, consider using external TTS services (such as OpenAI or ElevenLabs via Runtime AI Chatbot Integrator) or use the Standard model instead.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
This approach uses streaming text-to-speech synthesis with real-time lip sync:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Use Runtime Text To Speech to generate streaming speech from text
- Use Runtime Audio Importer to import the synthesized audio
- Before playing back the streaming sound wave, bind to its `OnGeneratePCMData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
The local TTS provided by Runtime Text To Speech plugin is not currently supported with the Realistic model due to ONNX runtime conflicts. For text-to-speech with the Realistic model, consider using external TTS services (such as OpenAI or ElevenLabs via Runtime AI Chatbot Integrator) or use the Standard model instead.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
- Regular
- Streaming
This approach uses the Runtime AI Chatbot Integrator plugin to generate synthesized speech from AI services (OpenAI or ElevenLabs) and perform lip sync:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Use Runtime AI Chatbot Integrator to generate speech from text using external APIs (OpenAI, ElevenLabs, etc.)
- Use Runtime Audio Importer to import the synthesized audio data
- Before playing back the imported sound wave, bind to its `OnGeneratePCMData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
The Realistic Model uses the same audio processing workflow as the Standard Model, but with the `RealisticLipSyncGenerator` variable instead of `VisemeGenerator`. In each of the examples shown for the Standard Model, simply replace `VisemeGenerator` with your `RealisticLipSyncGenerator` variable - the function names and parameters remain identical between both models.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
This approach uses the Runtime AI Chatbot Integrator plugin to generate synthesized streaming speech from AI services (OpenAI or ElevenLabs) and perform lip sync:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Use Runtime AI Chatbot Integrator to connect to streaming TTS APIs (like ElevenLabs Streaming API)
- Use Runtime Audio Importer to import the synthesized audio data
- Before playing back the streaming sound wave, bind to its `OnGeneratePCMData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
The Realistic Model uses the same audio processing workflow as the Standard Model, but with the `RealisticLipSyncGenerator` variable instead of `VisemeGenerator`. In each of the examples shown for the Standard Model, simply replace `VisemeGenerator` with your `RealisticLipSyncGenerator` variable - the function names and parameters remain identical between both models.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
This approach uses pre-recorded audio files or audio buffers for lip sync:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Use Runtime Audio Importer to import an audio file from disk or memory
- Before playing back the imported sound wave, bind to its `OnGeneratePCMData` delegate
- In the bound function, call `ProcessAudioData` from your Runtime Viseme Generator
- Play the imported sound wave and observe the lip sync animation
The Realistic Model uses the same audio processing workflow as the Standard Model, but with the `RealisticLipSyncGenerator` variable instead of `VisemeGenerator`. In each of the examples shown for the Standard Model, simply replace `VisemeGenerator` with your `RealisticLipSyncGenerator` variable - the function names and parameters remain identical between both models.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
For streaming audio data from a buffer, you need:
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Audio data in float PCM format (an array of floating-point samples) available from your streaming source
- The sample rate and number of channels
- Call `ProcessAudioData` from your Runtime Viseme Generator with these parameters as audio chunks become available
Here's an example of processing lip sync from streaming audio data:
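As a rough, language-agnostic illustration of this data flow, here is a standalone C++ sketch. The stub type and the `ProcessAudioData` signature below are assumptions for illustration only, not the plugin's actual C++ API:

```cpp
#include <cstdio>
#include <functional>
#include <vector>

// Stand-in for the generator's ProcessAudioData call. The real node takes
// float PCM data plus the sample rate and channel count; this exact C++
// signature is assumed for illustration.
using FProcessAudioData =
    std::function<void(const std::vector<float>& PCMData, int SampleRate, int NumChannels)>;

// Forward each chunk from your streaming source to the generator as it arrives.
void OnAudioChunkReceived(const std::vector<float>& Chunk, int SampleRate, int NumChannels,
                          const FProcessAudioData& ProcessAudioData)
{
    if (!Chunk.empty())
    {
        ProcessAudioData(Chunk, SampleRate, NumChannels);
    }
}

int main()
{
    // Hypothetical hookup: print instead of driving the lip sync generator.
    const FProcessAudioData ProcessAudioData =
        [](const std::vector<float>& PCMData, int SampleRate, int NumChannels)
    {
        std::printf("Processed %zu samples (%d Hz, %d channel(s))\n",
                    PCMData.size(), SampleRate, NumChannels);
    };

    // Simulate a 10 ms mono chunk at 16 kHz arriving from the stream.
    const std::vector<float> Chunk(160, 0.0f);
    OnAudioChunkReceived(Chunk, 16000, 1, ProcessAudioData);
    return 0;
}
```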
Note: When using streaming audio sources, make sure to manage audio playback timing appropriately to avoid distorted playback. See the Streaming Sound Wave documentation for more information on proper streaming audio management.
The Realistic Model uses the same audio processing workflow as the Standard Model, but with the `RealisticLipSyncGenerator` variable instead of `VisemeGenerator`. In each of the examples shown for the Standard Model, simply replace `VisemeGenerator` with your `RealisticLipSyncGenerator` variable - the function names and parameters remain identical between both models.
Note: When using streaming audio sources, make sure to manage audio playback timing appropriately to avoid distorted playback. See the Streaming Sound Wave documentation for more information on proper streaming audio management.
Note: If you want to process audio data in smaller chunks for more responsive lip sync, adjust the calculation in the `SetNumSamplesPerChunk` function. For example, dividing the sample rate by 150 (streaming every ~6.67 ms) instead of 100 (streaming every 10 ms) will provide more frequent lip sync updates.
Step 4: Anim Graph setup
After setting up the Event Graph, switch to the `Anim Graph` to connect the generator to the character's animation:
Lip Sync
- Standard (Faster) Model
- Realistic (Higher Quality) Model
- Locate the pose that contains the MetaHuman face (typically from `Use cached pose 'Body Pose'`)
- Add the `Blend Runtime MetaHuman Lip Sync` node
- Connect the pose to the `Source Pose` of the `Blend Runtime MetaHuman Lip Sync` node
- Connect your `RuntimeVisemeGenerator` variable to the `Viseme Generator` pin
- Connect the output of the `Blend Runtime MetaHuman Lip Sync` node to the `Result` pin of the `Output Pose`
When lip sync is detected in the audio, your character will dynamically animate accordingly.
- Locate the pose that contains the MetaHuman face (typically from `Use cached pose 'Body Pose'`)
- Add the `Blend Realistic MetaHuman Lip Sync` node
- Connect the pose to the `Source Pose` of the `Blend Realistic MetaHuman Lip Sync` node
- Connect your `RealisticLipSyncGenerator` variable to the `Lip Sync Generator` pin
- Connect the output of the `Blend Realistic MetaHuman Lip Sync` node to the `Result` pin of the `Output Pose`
The Realistic Model provides enhanced visual quality with more natural mouth movements.
Note: The Realistic Model is designed exclusively for MetaHuman characters and is not compatible with custom character types.
Laughter Animation
You can also add laughter animations that will dynamically respond to laughter detected in the audio:
- Add the `Blend Runtime MetaHuman Laughter` node
- Connect your `RuntimeVisemeGenerator` variable to the `Viseme Generator` pin
- If you're already using lip sync:
  - Connect the output from the `Blend Runtime MetaHuman Lip Sync` node to the `Source Pose` of the `Blend Runtime MetaHuman Laughter` node
  - Connect the output of the `Blend Runtime MetaHuman Laughter` node to the `Result` pin of the `Output Pose`
- If using only laughter without lip sync:
  - Connect your source pose directly to the `Source Pose` of the `Blend Runtime MetaHuman Laughter` node
  - Connect the output to the `Result` pin
When laughter is detected in the audio, your character will dynamically animate accordingly.
Combining with Body Animations
To apply lip sync and laughter alongside existing body animations without overriding them:
- Add a `Layered blend per bone` node between your body animations and the final output. Make sure `Use Attached Parent` is true.
- Configure the layer setup:
  - Add 1 item to the `Layer Setup` array
  - Add 3 items to the `Branch Filters` for the layer, with the following `Bone Name`s:
    - `FACIAL_C_FacialRoot`
    - `FACIAL_C_Neck2Root`
    - `FACIAL_C_Neck1Root`
- Make the connections:
  - Existing animations (such as `BodyPose`) → `Base Pose` input
  - Facial animation output (from lip sync and/or laughter nodes) → `Blend Poses 0` input
  - Layered blend node → Final `Result` pose
Why this works: The branch filters isolate facial animation bones, allowing lip sync and laughter to blend exclusively with facial movements while preserving original body animations. This matches the MetaHuman facial rig structure, ensuring natural integration.
Note: The lip sync and laughter features are designed to work non-destructively with your existing animation setup. They only affect the specific facial bones needed for mouth movement, leaving other facial animations intact. This means you can safely integrate them at any point in your animation chain - either before other facial animations (allowing those animations to override lip sync/laughter) or after them (letting lip sync/laughter blend on top of your existing animations). This flexibility lets you combine lip sync and laughter with eye blinking, eyebrow movements, emotional expressions, and other facial animations without conflicts.
Configuration
Lip Sync Configuration
- Standard (Faster) Model
- Realistic (Higher Quality) Model
The `Blend Runtime MetaHuman Lip Sync` node has configuration options in its properties panel:
| Property | Default | Description |
|---|---|---|
| Interpolation Speed | 25 | Controls how quickly the lip movements transition between visemes. Higher values result in faster, more abrupt transitions. |
| Reset Time | 0.2 | The duration in seconds after which the lip sync is reset. This is useful to prevent the lip sync from continuing after the audio has stopped. |
The `Blend Realistic MetaHuman Lip Sync` node has configuration options in its properties panel:
| Property | Default | Description |
|---|---|---|
| Interpolation Speed | 30 | Controls how quickly the lip movements transition between positions. Higher values result in faster, more abrupt transitions. |
| Reset Time | 0.2 | The duration in seconds after which the lip sync is reset. This is useful to prevent the lip sync from continuing after the audio has stopped. |
Laughter Configuration
The `Blend Runtime MetaHuman Laughter` node has its own configuration options:
| Property | Default | Description |
|---|---|---|
| Interpolation Speed | 25 | Controls how quickly the lip movements transition between laughter animations. Higher values result in faster, more abrupt transitions. |
| Reset Time | 0.2 | The duration in seconds after which the laughter is reset. This is useful to prevent the laughter from continuing after the audio has stopped. |
| Max Laughter Weight | 0.7 | Scales the maximum intensity of the laughter animation (0.0 - 1.0). |
Choosing Between Lip Sync Models
When deciding which lip sync model to use for your project, consider these factors:
| Consideration | Standard Model | Realistic Model |
|---|---|---|
| Character Compatibility | MetaHumans and all custom character types | MetaHumans only |
| Visual Quality | Good lip sync with efficient performance | Enhanced realism with more natural mouth movements |
| Performance | Optimized for all platforms including mobile/VR | Slightly higher resource requirements |
| Use Cases | General applications, games, VR/AR, mobile | Cinematic experiences, close-up character interactions |
Engine Version Compatibility
If you're using Unreal Engine 5.2, the Realistic Model may not work correctly due to a bug in UE's resampling library. For UE 5.2 users who need reliable lip sync functionality, please use the Standard Model instead.
This issue is specific to UE 5.2 and does not affect other engine versions.
For most projects, the Standard Model provides an excellent balance of quality and performance while supporting the widest range of character types. The Realistic Model is ideal when you need the highest visual fidelity specifically for MetaHuman characters in contexts where performance overhead is less critical.