), and they recently released Hallo3.
Effect Demonstration
Method Demonstration

Given a reference image, an audio sequence, and a text prompt, the method generates a dynamic avatar seen from the front or from other viewing angles, while keeping the subject's identity consistent over long sequences. It also handles dynamic foreground and background elements while preserving temporal consistency and high visual fidelity.
Method Overview

Audio Conditioning Strategy

- Self-Attention 
- Adaptive Normalization 
- Cross-Attention 
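
For readers unfamiliar with the three options listed above, the following is a minimal, generic sketch of how audio features could condition a sequence of video tokens under each strategy. It is not Hallo3's implementation: the function names, the toy token values, the mean pooling, and the absence of learned projections are all assumptions made for illustration, and it uses plain Python lists rather than a deep learning framework.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    # Scaled dot-product attention on lists of vectors.
    # Assumption: no learned Q/K/V projections, to keep the sketch short.
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

def self_attention_conditioning(video, audio):
    # Audio tokens are appended to the video tokens and the joint
    # sequence attends to itself; only the video positions are kept.
    joint = video + audio
    return attend(joint, joint, joint)[:len(video)]

def adaptive_norm_conditioning(video, audio):
    # Audio is pooled into one vector that modulates normalized video
    # tokens. Assumption: scale = 1 + pooled and shift = pooled stand in
    # for the learned projections a real adaptive-norm layer would use.
    dim = len(video[0])
    pooled = [sum(a[i] for a in audio) / len(audio) for i in range(dim)]
    out = []
    for tok in video:
        mean = sum(tok) / dim
        var = sum((x - mean) ** 2 for x in tok) / dim
        normed = [(x - mean) / math.sqrt(var + 1e-5) for x in tok]
        out.append([n * (1.0 + p) + p for n, p in zip(normed, pooled)])
    return out

def cross_attention_conditioning(video, audio):
    # Video tokens act as queries over audio keys/values; the attended
    # audio context is added back to each video token as a residual.
    context = attend(video, audio, audio)
    return [[x + c for x, c in zip(tok, row)]
            for tok, row in zip(video, context)]

# Toy inputs (hypothetical): 3 video tokens and 2 audio tokens, 4 channels each.
video = [[1.0, 0.0, 0.5, -0.5], [0.0, 1.0, -0.5, 0.5], [0.5, 0.5, 0.5, -1.0]]
audio = [[0.2, -0.1, 0.3, 0.0], [0.0, 0.4, -0.2, 0.1]]

for fn in (self_attention_conditioning,
           adaptive_norm_conditioning,
           cross_attention_conditioning):
    result = fn(video, audio)
    print(fn.__name__, len(result), "tokens x", len(result[0]), "channels")
```

In all three cases the output keeps the shape of the video token sequence; what differs is where the audio signal enters: inside the same attention sequence, through the normalization statistics, or through a separate attention over audio tokens.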
Identity Conditioning Strategy

- Facial Attention 
- Facial Adaptive Normalization 
- Identity Reference Network 
- Facial Attention and Identity Reference Network 
Scene

- Dynamic Scene 
- Diverse Head Poses 
- Portraits with Headwear 
- Portraits Interacting with Objects