Wan 2.5 Text to Video
A text-and-image-to-video generation model that creates high-quality clips at 720p or 1080p with synchronized audio (and can generate audio when none is provided). It’s especially strong at stylized visuals (e.g., comic-book layouts with appealing color and a consistent look) while maintaining generally solid cinematograph…
Inputs
Text, Audio
Outputs
Video
Lab
Alibaba Cloud
Wan 2.5 Text to Video sample outputs
Captured directly from our eval suite.