Overview

Runtime MetaHuman Lip Sync Documentation

Runtime MetaHuman Lip Sync is a plugin that enables real-time, offline, and cross-platform lip sync for both MetaHuman and custom characters. It animates a character's lips in response to audio input from various sources, including microphone input and text-to-speech output.

The plugin internally generates visemes (visual representations of phonemes) from the audio input. Because it works directly with audio data rather than text, it supports multilingual input, including but not limited to English, Spanish, French, German, Japanese, Chinese, Korean, Russian, Italian, Portuguese, Arabic, and Hindi. Lip sync is derived from audio phonemes rather than language-specific text processing, so any spoken language can be used.

The Standard Model produces 14 visemes and performs lip sync animation using a predefined pose asset. In contrast, the Realistic Models (exclusive to MetaHuman characters) generate 81 facial control changes without relying on a predefined pose asset, resulting in significantly more realistic facial animations.

Character Compatibility

Despite its name, Runtime MetaHuman Lip Sync works with a wide range of characters beyond just MetaHumans:

  • Daz Genesis 8/9 characters
  • Reallusion Character Creator 3/4 (CC3/CC4) characters
  • Mixamo characters
  • ReadyPlayerMe avatars

Animation Standards Support

  • FACS-based blendshape systems
  • Apple ARKit blendshape standard
  • Preston Blair phoneme sets
  • 3ds Max phoneme systems
  • Any character with custom morph targets for facial expressions

For detailed instructions on using the plugin with non-MetaHuman characters, see the Custom Character Setup Guide.
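
To make the idea of mapping visemes onto a custom character's morph targets more concrete, here is a minimal sketch in Python. It is not the plugin's API: the viseme labels, the `VISEME_TO_MORPHS` table, the `morph_weights` function, and all numeric weights are hypothetical and chosen only for illustration. The target names (`jawOpen`, `mouthClose`, `mouthFunnel`, `mouthPucker`) are borrowed from the Apple ARKit blendshape standard.

```python
# Hypothetical viseme labels and weights, for illustration only.
# The plugin's real viseme set and mapping assets may differ.
VISEME_TO_MORPHS = {
    "sil": {},                                   # silence: neutral face
    "PP": {"mouthClose": 1.0},                   # lips pressed together
    "FF": {"mouthFunnel": 0.4, "jawOpen": 0.1},
    "AA": {"jawOpen": 0.8},                      # open vowel
    "OU": {"mouthPucker": 0.9, "jawOpen": 0.2},  # rounded vowel
}


def morph_weights(viseme_weights):
    """Blend per-viseme weights into per-morph-target weights in [0, 1]."""
    blended = {}
    for viseme, weight in viseme_weights.items():
        # Visemes without a mapping entry are ignored.
        for morph, scale in VISEME_TO_MORPHS.get(viseme, {}).items():
            blended[morph] = blended.get(morph, 0.0) + weight * scale
    # Clamp so that overlapping visemes never overdrive a morph target.
    return {m: min(1.0, max(0.0, v)) for m, v in blended.items()}
```

A character that uses a different standard (for example, a Preston Blair phoneme set) would only need a different mapping table; the blending step stays the same.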

Animation Preview

Check out these short animations to see the quality of lip sync animation produced by the plugin across different character types and models:

  • Realistic model with MetaHuman character
  • Standard model with MetaHuman character
  • Standard model with custom character (two examples)

Key Features

  • Real-time lip sync from microphone input
  • Offline audio processing support
  • Cross-platform compatibility with model-specific platform support
  • Support for multiple character systems and animation standards
  • Flexible viseme mapping for custom characters
  • Universal language support - works with any spoken language through audio analysis
  • Mood-aware facial animation for enhanced expressiveness
  • Configurable output types (full face or mouth-only controls)

Lip Sync Models

The plugin offers multiple lip sync models to suit different project needs: the Standard Model, the Realistic Model, and the Mood-Enabled Realistic Model.

The standard lip sync model provides efficient, cross-platform performance with broad character compatibility:

  • Works with MetaHumans and all custom character types
  • Optimized for real-time performance
  • Lower resource requirements
  • Full compatibility with local TTS (Runtime Text To Speech plugin)
  • Platform Support: Windows, Android, Android-based platforms (including Meta Quest)
  • Three optimization levels: Original, Semi-Optimized, and Highly Optimized

Extension Plugin Required

To use the Standard Model, you need to install an additional extension plugin. See the Prerequisites section for installation instructions.

You can choose the appropriate model based on your project requirements for performance, character compatibility, visual quality, target platform, and feature needs.

TTS Compatibility Note

While all models support various audio input methods, the regular Realistic model has limited compatibility with local TTS due to ONNX runtime conflicts. The Mood-Enabled Realistic model, however, is fully compatible with local TTS. For text-to-speech functionality:

  • Standard Model: Compatible with all TTS options (local and external)
  • Realistic Model: External TTS services recommended (OpenAI, ElevenLabs)
  • Mood-Enabled Realistic Model: Compatible with all TTS options (local and external)
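
The compatibility rules above can be written down as a small lookup table, which may help when deciding on a model early in a project. The following is only a sketch: `ModelInfo`, `MODELS`, and `candidate_models` are hypothetical names, not part of the plugin, and the table encodes only what this page states (the Realistic models are MetaHuman-only, and the regular Realistic model has limited local TTS compatibility).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    metahuman_only: bool  # True if the model works only with MetaHumans
    local_tts: str        # "full" or "limited", as described on this page


# Hypothetical table summarizing the statements in this overview.
MODELS = {
    "Standard": ModelInfo(metahuman_only=False, local_tts="full"),
    "Realistic": ModelInfo(metahuman_only=True, local_tts="limited"),
    "Mood-Enabled Realistic": ModelInfo(metahuman_only=True, local_tts="full"),
}


def candidate_models(is_metahuman, needs_local_tts=False):
    """Return the model names that satisfy the given project constraints."""
    names = []
    for name, info in MODELS.items():
        if info.metahuman_only and not is_metahuman:
            continue  # Realistic models are exclusive to MetaHumans
        if needs_local_tts and info.local_tts != "full":
            continue  # skip models with only limited local TTS support
        names.append(name)
    return names
```

For example, `candidate_models(is_metahuman=False)` leaves only the Standard Model, while a MetaHuman project that relies on local TTS can pick between the Standard and Mood-Enabled Realistic models.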

How It Works

The plugin processes audio input in the following way:

  1. Audio data is received in float PCM format, along with its channel count and sample rate
  2. The plugin processes the audio to generate facial control data or visemes depending on the model
  3. For mood-enabled models, emotional context is applied to the facial animation
  4. The animation data drives the character's facial movements in real time
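
The data flow in the steps above can be sketched with a few lines of Python. This is a toy illustration under stated assumptions, not the plugin's implementation: all function names are hypothetical, and the RMS loudness envelope used here is a deliberately simple stand-in for the plugin's actual model inference, which is not described on this page. The sketch shows how 16-bit samples become float PCM, how interleaved channels are downmixed, and how per-frame values could drive a single jaw control.

```python
import math


def int16_to_float(samples):
    """Convert signed 16-bit PCM samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def downmix_to_mono(interleaved, channels):
    """Average interleaved channel samples into one mono stream."""
    if channels < 1 or len(interleaved) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    return [
        sum(interleaved[i:i + channels]) / channels
        for i in range(0, len(interleaved), channels)
    ]


def frame_rms(mono, sample_rate, frame_ms=10):
    """RMS loudness per frame (toy stand-in for real viseme inference)."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    result = []
    for start in range(0, len(mono), frame_len):
        frame = mono[start:start + frame_len]
        result.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return result


def jaw_open_curve(rms_values, gain=4.0):
    """Map loudness to a jaw-open weight in [0, 1]; gain is arbitrary."""
    return [min(1.0, r * gain) for r in rms_values]
```

A typical call chain would be `jaw_open_curve(frame_rms(downmix_to_mono(pcm, channels), sample_rate))`, producing one animation value per 10 ms frame. A real model replaces the loudness step with viseme or facial control prediction, but the input contract (float PCM plus channel count and sample rate) is the same as in step 1.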

Quick Start

Here's a basic setup for enabling lip sync on your character:

  1. For MetaHuman characters, follow the Setup Guide
  2. For custom characters, follow the Custom Character Setup Guide
  3. Choose and configure your preferred lip sync model
  4. Set up audio input processing in your Blueprint
  5. Connect the appropriate lip sync node in the Animation Blueprint
  6. Play audio and see your character speak with emotion!

Additional Resources

🎥 Video Tutorials

Realistic Model (High-Quality) Tutorials:

Standard Model Tutorials:

General Setup:

💬 Support