Wan 2.5 Text to Video

A text-and-image-to-video generation model that creates high-quality clips at 720p or 1080p with synchronized audio (and can generate audio when none is provided). It’s especially strong at stylized visuals—e.g., comic-book layouts with appealing color and consistent look—while maintaining generally solid cinematograph…

Speed

Price

Inputs

Text, Audio

Outputs

Video

Lab

Alibaba Cloud

Wan 2.5 Text to Video sample outputs

Captured directly from our eval suite.
