StableDiffusion launches generative AI video maker

  • Stability.AI, the company behind image generation technology StableDiffusion, has announced new generative models that can create short video clips from text prompts.
  • The company says the models, called Stable Video Diffusion, outperform the current text-to-video generators on the market.
  • Stable Video Diffusion can generate dynamic but short clips, and it struggles with generating realistic humans.

AI-generated images are becoming increasingly popular online because this type of media can be spun up easily and quickly with image generation technologies like DALL-E, Midjourney, and StableDiffusion.

Pushing beyond still images, the companies whose technology is used to create this media are now slowly moving towards AI-generated video.

This makes sense, as a video – like motion pictures when they first emerged – is a series of images, or frames, shown in quick succession to create the illusion of motion.

Why not have these frames also be created by generative AI? On Tuesday, Stability.AI, the creators of StableDiffusion, announced the launch of Stable Video Diffusion, a “video diffusion model for high resolution, state of the art text to video and image to video generation.”

It turns out the company is simply adapting its existing image-generating “diffusion” models to video generation by “inserting temporal layers.”

This is loosely similar to the way movies work, with images played one after another on a timed basis. Other 2D image models can be trained to do this, but the final products are usually spotty at best.
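To make the idea of a layer that works across frames a little more concrete, here is a toy sketch in Python. It is not Stability.AI's architecture: a fixed averaging filter stands in for the learned temporal layers, and the names `temporal_smooth` and `jitter` are made up for this illustration. It only shows why mixing information between neighbouring frames reduces flicker compared with generating each frame on its own.

```python
from typing import List, Sequence

Frame = List[float]  # one frame, flattened to a list of pixel values


def temporal_smooth(frames: List[Frame],
                    weights: Sequence[float] = (0.25, 0.5, 0.25)) -> List[Frame]:
    """Blend each value with the same position in the previous and next frame.

    Assumption: a fixed 3-tap filter is used here purely as a stand-in for
    learned temporal layers, which would have trainable weights.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev_f = frames[max(t - 1, 0)]      # reuse the first frame at the clip start
        next_f = frames[min(t + 1, n - 1)]  # reuse the last frame at the clip end
        cur_f = frames[t]
        out.append([
            weights[0] * p + weights[1] * c + weights[2] * nx
            for p, c, nx in zip(prev_f, cur_f, next_f)
        ])
    return out


def jitter(frames: List[Frame]) -> float:
    """Mean absolute change per value between consecutive frames."""
    total = 0.0
    count = 0
    for a, b in zip(frames, frames[1:]):
        for x, y in zip(a, b):
            total += abs(x - y)
            count += 1
    return total / count


# Four tiny two-pixel "frames" that flicker because each was made independently.
flickering = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
smoothed = temporal_smooth(flickering)
print(jitter(flickering), jitter(smoothed))
```

The smoothed clip changes far less from one frame to the next than the flickering one, which is the kind of consistency a video model has to learn on top of what an image model already knows.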

Stability.AI believes its new video models, two of which it is releasing, already beat existing text-to-video models like those from Runway.

“At the time of release in their foundational form, through external evaluation, we have found these models surpass the leading closed models in user preference studies,” the company has said.

It showed off a teaser of the potential of the new models in a 20-second archived YouTube video.

The short clips are clearly high quality, and unlike the output of rival models, they have some dynamism to them – such as the clip with the two blue jays. Usually, when text-to-video models are tasked with a lot of motion, they lose some of the image fidelity.

However, Stability.AI did indicate that Stable Video Diffusion has some limitations right now. It can only generate clips that are less than four seconds long, it cannot generate photorealistic clips, it can’t do any camera motions other than slow pans, and it cannot generate legible text in the clips. People and human faces are also out of the question.

Similar issues sprang up when text-to-image generation models like StableDiffusion were first introduced. The good news for Stability.AI, and perhaps the bad news for peddlers of stock footage, is that these issues will likely be ironed out gradually as the models become more advanced, much like how the latest image generation models are starting to get human hands right.

Stability.AI says that the tool was trained on “millions” of video clips that were publicly available for research purposes. Training the AI on other sources may lead to better quality clips, but could land the company in hot water over copyright issues.

As Engadget points out, Stability.AI was sued by Getty Images in January over allegations that the company illegally accessed Getty’s image data to train its AI.

Content creators would be among the first to use the technology, for example to generate filler clips in YouTube videos, with many already using AI-generated images on video thumbnails.

This of course bodes ill for designers and digital artists who continue to battle against AI-generated content. Probably to no avail.

You can sign up right now to test the latest tech from Stability.AI, such as Stable Video Diffusion, here.
