EchoMimicV3: How a 1.3B-Parameter Model Masters Multi-Modal Human Animation

7 hours ago 高效码农

tags: EchoMimicV3, 1.3B, Soup-of-Tasks, Soup-of-Modals, CDCA, PhDA, Negative DPO, PNG, Long Video CFG, Wan2.1-FUN

EchoMimicV3: How a 1.3B-parameter Model Unifies Multi-Modal, Multi-Task Human Animation

Intro (what you’ll learn in a few lines)

This post explains, using only the provided project README and paper, how EchoMimicV3 is designed and implemented to produce multi-modal, multi-task human animation with a compact 1.3B-parameter model. You’ll get a clear view of the problem framing, the core building blocks (Soup-of-Tasks, Soup-of-Modals / CDCA, PhDA), the training and inference strategies (Negative DPO, PNG, Long Video CFG), …