Chinese tech giant Tencent has released the latest version of its open-source video generation model, DynamiCrafter, on GitHub. Built on the physics-inspired diffusion method, the AI tool turns captions and still images into seconds-long videos.
The second generation of the model produces videos at an improved resolution of 640×1024 pixels, a clear advance over its initial release in October.
The key differentiator for DynamiCrafter, according to an academic paper by its developers, is that it broadens image animation techniques to a much wider spectrum of visual content. Rather than focusing on specific domains such as natural scenes or human motion, as traditional techniques do, the model incorporates the motion prior of text-to-video diffusion models, using the input image as guidance.
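The idea of conditioning a video diffusion process on a still image can be sketched at a toy level. The code below is a hypothetical illustration, not Tencent's implementation: the encoder, denoiser, and guidance schedule are all stand-ins, showing only the general shape of an image-guided iterative denoising loop.

```python
import numpy as np

# Toy sketch (hypothetical, not DynamiCrafter's actual code): frame latents
# start as pure noise and are iteratively denoised while being guided by
# features extracted from the input still image.

rng = np.random.default_rng(0)

def encode_image(image):
    """Stand-in image encoder: mean-pool the image into a feature vector."""
    return image.mean(axis=(0, 1))  # shape: (channels,)

def denoise_step(latents, image_feat, t, total_steps):
    """Stand-in denoiser: nudge noisy frame latents toward the image feature.

    A real model would instead predict noise with a learned network that
    receives the image (and a text prompt) as conditioning.
    """
    weight = (total_steps - t) / total_steps  # simple guidance schedule
    return latents + 0.1 * weight * (image_feat - latents)

def animate(image, num_frames=16, steps=25):
    """Produce `num_frames` latent frames conditioned on one still image."""
    image_feat = encode_image(image)                         # (C,)
    latents = rng.standard_normal((num_frames, image_feat.shape[0]))
    for t in range(steps):
        latents = denoise_step(latents, image_feat, t, steps)
    return latents

frames = animate(rng.random((64, 64, 3)))
print(frames.shape)  # one latent vector per generated frame
```

In a real pipeline the latents would be decoded back into RGB frames; here they simply converge toward the image features, which is the cartoon version of "using the image as guidance."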
DynamiCrafter vs Stable Video Diffusion
In a comparison demo against Stable Video Diffusion and Pika Labs, DynamiCrafter produces slightly more animated results, underscoring its potential in the evolving field of generative video. None of these models yet produces full-fledged movies, but generative video is drawing growing attention in the AI landscape following the success of generative text and images.
Tencent joins other Chinese tech giants like ByteDance, Baidu, and Alibaba in advancing AI video generation. ByteDance’s MagicVideo and Baidu’s UniVG have posted demos on GitHub, while Alibaba has open-sourced its video generation model VGen.
This move aligns with the global trend of Chinese tech firms sharing AI technologies with the broader developer community. The continued focus on AI video generation indicates its potential as a significant frontier in the ongoing AI race.