> But it's not a game. It's a memory of a game video, predicting the next frame based on the few previous frames, like "I can imagine what happened next".
It's not super clear from the landing page, but I think it's an engine? Like, its input is both the previous frames and the player's input for the next frame.
So as a player, if you press "shoot", the diffusion engine needs to output an image where the monster in front of you takes damage/dies.
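To make the "engine" reading concrete, here's a toy sketch of the loop I have in mind. It is not GameNGen's code: `predict_next_frame`, `play`, `CONTEXT`, and the dict-as-frame representation are all made up for illustration, and the stand-in "model" is a plain function where the real thing would be a diffusion model.

```python
from collections import deque

CONTEXT = 4  # hypothetical: how many past frames/actions the model gets to see


def predict_next_frame(frames, actions):
    """Stand-in for the neural model (hypothetical, not GameNGen's API).

    A real model would generate pixels conditioned on past frames and
    actions. This toy version uses a dict as the "frame" so the data flow
    is visible: the latest action changes what the next frame shows.
    """
    frame = dict(frames[-1])  # start from the most recent frame
    if actions[-1] == "shoot" and frame["monster_hp"] > 0:
        frame["monster_hp"] -= 50  # the shot shows up in the next frame
    return frame


def play(inputs):
    """Feed player inputs one at a time; each output frame becomes context."""
    frames = deque([{"monster_hp": 100}], maxlen=CONTEXT)
    actions = deque(maxlen=CONTEXT)
    for action in inputs:
        actions.append(action)
        frames.append(predict_next_frame(list(frames), list(actions)))
    return frames[-1]
```

The point is just that the player's action is an input alongside the frame history, so it's interactive rather than a replay: `play(["move", "shoot", "shoot"])` ends with the monster at zero health, while `play(["move"])` leaves it untouched.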
> We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality.