We're getting a look at how DeepMind's latest AI works now that it has demonstrated it can dream up short videos from a single frame.
The model, named "Transframer," takes its name from the "transformer," the class of AI models that whips up text based on partial prompts.
Transframer is a general-purpose generative framework that can handle many image and video tasks in a probabilistic setting. New work shows it excels in video prediction and view synthesis, and can generate 30s videos from a single image: https://t.co/wX3nrrYEEa 1/ pic.twitter.com/gQk6f9nZyg
— DeepMind (@DeepMind) August 15, 2022
The Transframer website explains that the AI makes its perspective videos by predicting target images from "context images" — for example, given a few photos of a chair, it guesses what that chair would look like from an unseen angle.
The model appears to apply artificial depth perception and perspective to generate what a scene would look like if someone moved around it, raising the possibility of entire video games built on machine learning instead of traditional rendering.
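To make the idea concrete, here's a minimal sketch of that conditioning scheme: a generative model samples a target frame given context images and viewpoint annotations, and each predicted frame is fed back in as new context to roll out a short video. This is not DeepMind's actual API; `AnnotatedFrame`, `DummyModel`, `predict_frame`, and `render_orbit` are all hypothetical names for illustration.

```python
# Hedged sketch of context-conditioned frame prediction; all names are
# illustrative stand-ins, not Transframer's real interface.
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class AnnotatedFrame:
    image: np.ndarray        # H x W x 3 image
    camera_pose: np.ndarray  # hypothetical viewpoint annotation (4x4 matrix)

class DummyModel:
    """Stand-in generative model that returns random pixels; a real
    trained model would sample a plausible view of the scene."""
    def sample(self, context: List[AnnotatedFrame],
               target_annotation: np.ndarray) -> np.ndarray:
        h, w, _ = context[0].image.shape
        return np.random.rand(h, w, 3)

def predict_frame(model, context: List[AnnotatedFrame],
                  target_pose: np.ndarray) -> np.ndarray:
    # Predict one target image conditioned on the context frames
    # and the desired camera viewpoint.
    return model.sample(context=context, target_annotation=target_pose)

def render_orbit(model, seed_frame: AnnotatedFrame,
                 poses: List[np.ndarray]) -> List[np.ndarray]:
    # Build a short video from a single seed image by predicting one
    # frame per pose and appending each result to the context.
    context = [seed_frame]
    video = []
    for pose in poses:
        frame = predict_frame(model, context, pose)
        video.append(frame)
        context.append(AnnotatedFrame(image=frame, camera_pose=pose))
    return video

# Usage: roll out 8 frames around a scene from one seed image.
seed = AnnotatedFrame(image=np.zeros((64, 64, 3)), camera_pose=np.eye(4))
orbit_poses = [np.eye(4) for _ in range(8)]  # placeholder camera poses
frames = render_orbit(DummyModel(), seed, orbit_poses)
print(len(frames))  # -> 8
```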
One example of the kind of AI-on-AI action we'll likely be seeing a lot more of: there are plans to use Transframer in conjunction with outputs from OpenAI's DALL-E image-generation model.
DeepMind has shared a handful of GIFs showing off the results.
The research is detailed in the paper "Transframer: Arbitrary Frame Prediction with Generative Models."