Google’s AI video generator, Lumiere, uses a new model called Space-Time U-Net, or STUNet, that works out where things are in a video (space) and how they move and change (time) at the same time. This lets Lumiere create a video in a single pass, rather than stitching together separate still frames.
Lumiere starts by creating a base frame from a text prompt. It then uses the STUNet framework to estimate where the objects in that frame will move, and generates further frames that flow into one another, giving the impression of smooth motion.
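The idea of handling space and time together can be illustrated with a toy sketch. This is not Lumiere's code, and the article does not describe STUNet's internals. The sketch only assumes that a video is treated as one block of values indexed by time, height and width, and that the block is shrunk along all three axes in one step. The function name `spacetime_downsample` and its pooling factors are hypothetical, chosen for illustration.

```python
def spacetime_downsample(video, t_factor=2, s_factor=2):
    """Average-pool a video over time AND space in one step.

    video: nested list indexed as video[t][y][x] (grayscale values).
    Returns a nested list of shape
    (T // t_factor, H // s_factor, W // s_factor).
    Illustrative only; not the actual STUNet operation.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    out = []
    # Step through the clip in blocks of t_factor frames,
    # dropping any leftover frames or pixels at the edges.
    for t in range(0, T - T % t_factor, t_factor):
        frame = []
        for y in range(0, H - H % s_factor, s_factor):
            row = []
            for x in range(0, W - W % s_factor, s_factor):
                # One block spans several frames and several pixels.
                block = [
                    video[t + dt][y + dy][x + dx]
                    for dt in range(t_factor)
                    for dy in range(s_factor)
                    for dx in range(s_factor)
                ]
                row.append(sum(block) / len(block))
            frame.append(row)
        out.append(frame)
    return out


# Toy clip: 4 frames of 4x4 pixels; every pixel in frame t has value t.
clip = [[[float(t)] * 4 for _ in range(4)] for t in range(4)]
small = spacetime_downsample(clip)
print(len(small), len(small[0]), len(small[0][0]))  # 2 2 2
print(small[0][0][0], small[1][0][0])  # 0.5 2.5
```

Each output value here blends pixels from neighbouring frames as well as neighbouring positions, which is the sense in which the clip is processed as a whole rather than frame by frame.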
Source: The Verge