
Lumiere builds on a pre-trained text-to-image diffusion model, which determines where each element sits in space and how it moves over time. The AI generates realistic, sharp and coherent motion from written prompts or from existing images.
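For readers curious about the mechanics, the sketch below illustrates the general idea in simplified Python: a text-conditioned diffusion model starts from pure noise covering every frame of a clip and progressively denoises the whole spatio-temporal volume at once, so space and motion are handled together. The `model` callable, tensor shapes and update rule here are illustrative assumptions, not Lumiere's actual (unreleased) code.

```python
# Illustrative sketch only: shows how a text-conditioned video diffusion model
# could denoise an entire clip (frames x height x width) in a single pass.
import torch

def sample_video(model, text_embedding, num_frames=16, height=64, width=64, steps=50):
    # Start from pure noise for every frame of the clip at once.
    video = torch.randn(1, 3, num_frames, height, width)
    for t in reversed(range(steps)):
        timestep = torch.tensor([t])
        # The hypothetical `model` predicts the noise present in the whole
        # spatio-temporal volume, conditioned on the text prompt embedding.
        predicted_noise = model(video, timestep, text_embedding)
        # Remove a fraction of the predicted noise (deliberately simplified update rule).
        video = video - predicted_noise / steps
    return video.clamp(-1, 1)  # final frames, values in [-1, 1]
```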
Google’s dedicated website and demo video show Lumiere generating video sequences lasting a few seconds from raw text alone (“Astronaut on the planet Mars making a detour around his base” or “a dog driving a car on a suburban street wearing funny sunglasses”) or from images (making a character in a painting smile, a car roll, a fish swim, etc.).
Lumiere can even create an animated sequence in a given style, provided it has a reference image to work from. It can also modify elements of an existing video, such as changing a person’s clothes or putting glasses on an animal.
Still in the experimental stage, Lumiere is the result of a collaboration between researchers at Google Research, the Weizmann Institute of Science and Tel Aviv University in Israel.