Wan2.2 in Plain English
A complete, no-jargon guide to installing, downloading, and running the newest open-source video-generation model
Who this is for
Junior-college graduates, indie creators, junior developers, and anyone who wants to turn text or images into 720p, 24 fps videos on their own hardware or cloud instance.
No PhD required.
1. Three facts you need to know first
2. The four upgrades that matter
3. Installation: three proven paths
Pick one and move on.
If you already use Python daily, the pip route is fastest.
If you like reproducible environments, use Poetry.
If flash-attn refuses to compile, see the troubleshooting table in section 3.3.
3.1 pip (universal)
git clone https://github.com/Wan-Video/Wan2.2.git
cd Wan2.2
# 1. Make sure torch ≥ 2.4.0 is installed first
pip install -r requirements.txt
# 2. If flash-attn fails to build, install it last with the flag below
pip install flash-attn --no-build-isolation
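Step 1 above asks for torch ≥ 2.4.0 before anything else is installed. A minimal sketch of that check is shown here; the helper names (`release_tuple`, `meets_minimum`, `check_torch`) are made up for illustration and are not part of the Wan2.2 repository. It compares only the numeric release, so a build tag such as `+cu121` is ignored and a pre-release of 2.4.0 is treated as 2.4.0.

```python
import re
from importlib import metadata

MIN_TORCH = "2.4.0"  # minimum stated in step 1 of the pip route


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    parts += [0] * (3 - len(parts))  # pad '2.4' to (2, 4, 0) so tuples compare fairly
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TORCH) -> bool:
    """Compare numeric release parts only; local tags such as '+cu121' are ignored."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_torch() -> str:
    """Describe whether the torch in the current environment satisfies MIN_TORCH."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return "torch is not installed; install it before requirements.txt"
    verdict = "OK" if meets_minimum(installed) else f"too old, need >= {MIN_TORCH}"
    return f"torch {installed}: {verdict}"


if __name__ == "__main__":
    print(check_torch())
```

Running the file inside the environment you plan to use prints a one-line verdict, which is quicker than discovering a version mismatch halfway through a flash-attn build.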
3.2 Poetry (fully locked)
# 0. Install Poetry once
curl -sSL https://install.python-poetry.org | python3 -
# 1. Install every dependency
poetry install
# 2. If flash-attn still errors
poetry run pip install --upgrade pip setuptools wheel
poetry run pip install flash-attn --no-build-isolation
poetry install # re-sync the environment with the lock file
3.3 Common errors and quick fixes
4. Downloading the weights
All checkpoints are hosted on two official hubs, Hugging Face and ModelScope. Choose whichever is faster to reach from your location.
4.1 Hugging Face CLI example
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.2-T2V-A14B --local-dir ./Wan2.2-T2V-A14B
4.2 ModelScope CLI example
pip install modelscope
modelscope download Wan-AI/Wan2.2-T2V-A14B --local_dir ./Wan2.2-T2V-A14B
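If you would rather download from a script than from the command line, the Hugging Face download can be sketched in Python as follows. `snapshot_download` is the `huggingface_hub` library function for fetching a whole repository; the wrapper names `default_local_dir` and `download_checkpoint` are hypothetical helpers introduced only for this example.

```python
from typing import Optional


def default_local_dir(repo_id: str) -> str:
    """Mirror the CLI examples: 'Wan-AI/Wan2.2-T2V-A14B' -> './Wan2.2-T2V-A14B'."""
    return "./" + repo_id.rsplit("/", 1)[-1]


def download_checkpoint(repo_id: str, local_dir: Optional[str] = None) -> str:
    """Download every file of a Hugging Face model repo and return the local path."""
    # Imported lazily so default_local_dir works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir or default_local_dir(repo_id)
    return snapshot_download(repo_id=repo_id, local_dir=target)
```

Calling `download_checkpoint("Wan-AI/Wan2.2-T2V-A14B")` should leave the files in `./Wan2.2-T2V-A14B`, the same folder the `--ckpt_dir` flag points to in section 5.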
5. First run: three starter commands
Replace the prompts with your own text or image path.
5.1 Text-to-Video (T2V-A14B)
Single GPU (needs 80 GB of VRAM)
python generate.py \
--task t2v-A14B \
--size 1280*720 \
--ckpt_dir ./Wan2.2-T2V-A14B \
--offload_model True \
--convert_model_dtype \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
8 GPUs on a single node (FSDP + DeepSpeed Ulysses)
torchrun --nproc_per_node=8 generate.py \
--task t2v-A14B \
--size 1280*720 \
--ckpt_dir ./Wan2.2-T2V-A14B \
--dit_fsdp --t5_fsdp --ulysses_size 8 \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
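The two commands in section 5.1 differ only in the launcher and in the memory or parallelism flags. A small sketch that builds either one from the same inputs is given below. `build_command` is a hypothetical helper, it uses only flags that appear in this guide, and it assumes that `--ulysses_size` should equal the number of processes, as it does in the 8-GPU example.

```python
import shlex


def build_command(task, size, ckpt_dir, prompt, gpus=1):
    """Return the argv list for generate.py, using only flags shown in this guide."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    common = ["--task", task, "--size", size, "--ckpt_dir", ckpt_dir]
    if gpus == 1:
        # Single GPU: offload weights and convert dtype to reduce peak memory.
        launcher = ["python", "generate.py"]
        extra = ["--offload_model", "True", "--convert_model_dtype"]
    else:
        # Several GPUs on one machine: shard with FSDP, split sequences with Ulysses.
        launcher = ["torchrun", f"--nproc_per_node={gpus}", "generate.py"]
        extra = ["--dit_fsdp", "--t5_fsdp", "--ulysses_size", str(gpus)]
    return launcher + common + extra + ["--prompt", prompt]


if __name__ == "__main__":
    # Placeholder prompt; substitute your own text.
    argv = build_command("t2v-A14B", "1280*720", "./Wan2.2-T2V-A14B",
                         "A cat boxing on a stage", gpus=8)
    print(shlex.join(argv))  # shell-quoted, safe to paste into a terminal
```

Building the command as a list and quoting it with `shlex.join` keeps long prompts with spaces or quotes from being split by the shell.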
5.2 Image-to-Video (I2V-A14B)
python generate.py \
--task i2v-A14B \
--size 1280*720 \
--ckpt_dir ./Wan2.2-I2V-A14B \
--image examples/i2v_input.JPG \
--offload_model True \
--convert_model_dtype \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard..."
5.3 Text-and-Image-to-Video (TI2V-5B)
Runs on 24 GB VRAM (e.g., RTX 4090)
python generate.py \
--task ti2v-5B \
--size 1280*704 \
--ckpt_dir ./Wan2.2-TI2V-5B \
--offload_model True \
--convert_model_dtype \
--t5_cpu \
--image examples/i2v_input.JPG \
--prompt "Summer beach vacation style..."
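The `--size` values in the commands above are written as two numbers joined by `*` (1280*720 for the A14B models, 1280*704 for TI2V-5B). The sketch below validates such a string before a long run starts. `parse_size` is a hypothetical helper, and reading the first number as width and the second as height is an assumption based on the 720p landscape examples.

```python
def parse_size(size: str) -> tuple:
    """Split a '--size' value such as '1280*704' into (width, height) integers."""
    width_text, sep, height_text = size.partition("*")
    if not sep or not width_text.isdigit() or not height_text.isdigit():
        raise ValueError(f"expected WIDTH*HEIGHT, got {size!r}")
    return int(width_text), int(height_text)
```

Because `*` is also a shell wildcard, quoting the value (`--size "1280*704"`) is a cheap precaution if the command ever runs in a directory with unusual file names.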
6. Prompt extension (optional but useful)
If you prefer not to craft long prompts yourself, let a large language model expand them for you.
7. Performance snapshot on common GPUs
Format: total time (s) / peak GPU memory (GB)
Settings: multi-GPU uses FSDP + Ulysses; single-GPU uses offloading and dtype conversion.
8. Troubleshooting FAQ
Q1: How long can the generated clips be?
- TI2V-5B: default 5 s at 24 fps.
- A14B: 5–8 s at 720p, longer at 480p.
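To turn the durations above into frame counts, multiply seconds by frames per second. The sketch below does only that arithmetic; `frame_count` is a hypothetical helper, and the number of frames the model actually writes may differ slightly from this nominal figure.

```python
def frame_count(seconds: float, fps: int = 24) -> int:
    """Nominal number of frames in a clip: duration multiplied by frame rate."""
    return round(seconds * fps)


if __name__ == "__main__":
    # Default 5 s clip and the 8 s upper bound quoted for A14B at 720p.
    print(frame_count(5), frame_count(8))
```

At 24 fps a 5-second clip is nominally 120 frames and an 8-second clip is 192, which is a useful sanity check when estimating render time or disk usage.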
Q2: How do I avoid out-of-memory errors?
- Add --offload_model True
- Add --convert_model_dtype (fp16/bf16)
- Move the text encoder to CPU: --t5_cpu
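The three flags above can be appended automatically, and the VRAM figures quoted in section 5 can serve as a rough pre-flight check. The sketch below assumes exactly that; all names in it are hypothetical, and the thresholds are the two single-GPU figures this guide states (80 GB for T2V-A14B, 24 GB for TI2V-5B), not measured limits.

```python
# Single-GPU memory figures quoted in section 5 of this guide (GB of VRAM).
GUIDE_VRAM_GB = {"t2v-A14B": 80, "ti2v-5B": 24}


def add_memory_flags(argv):
    """Return a copy of argv with the three memory-saving flags appended if missing."""
    result = list(argv)
    if "--offload_model" not in result:
        result += ["--offload_model", "True"]
    for flag in ("--convert_model_dtype", "--t5_cpu"):
        if flag not in result:
            result.append(flag)
    return result


def fits_guide_figure(task: str, vram_gb: float) -> bool:
    """True if vram_gb meets the single-GPU figure this guide quotes for task."""
    return vram_gb >= GUIDE_VRAM_GB[task]


def detect_vram_gb(device: int = 0) -> int:
    """Total memory of one CUDA device, rounded to whole GB (needs torch with CUDA)."""
    import torch  # lazy import so the helpers above run without torch

    total_bytes = torch.cuda.get_device_properties(device).total_memory
    # Cards report slightly less than their nominal size, so round to the nearest GB.
    return round(total_bytes / 1024**3)
```

For example, `fits_guide_figure("t2v-A14B", 24)` is false, which suggests switching to the TI2V-5B command on a 24 GB card rather than waiting for an out-of-memory error.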
Q3: Where are the output videos saved?
- In outputs/, inside a timestamped sub-folder.
Q4: Does it work on Windows?
- Yes. Install PyTorch with CUDA first; the remaining steps are identical.
9. Developer extras
- Code formatting
black .
isort .
- Run unit tests
bash tests/test.sh
- Ready-made integrations
10. Citation
If this guide or the model helps your work, please cite:
@article{wan2025,
title={Wan: Open and Advanced Large-Scale Video Generative Models},
author={Team Wan and others},
journal={arXiv preprint arXiv:2503.20314},
year={2025}
}
11. License and usage responsibility
- Code & weights: Apache 2.0
- Generated content: You own it, but you must comply with local laws.
- Full legal text: see LICENSE.txt in the repository root.
Happy creating!