Last December we wrote about an intriguing startup called World Labs that could generate 3D worlds from single images. But bigger fish are playing in that space too: Google’s DeepMind AI division has just unveiled its latest “general purpose world model”, called Genie 3.
“Given a text prompt, Genie 3 can generate dynamic worlds that you can navigate in real time at 24 frames per second, retaining consistency for a few minutes at a resolution of 720p,” DeepMind explained in its announcement. Consistency here means that objects, scenery and so on stay in place even after you look away from them. There are clear uses for gaming here, but also education and… perhaps music? It would be interesting to see what creative artists can do with this kind of technology.
However, DeepMind’s long-term plan for these ‘world models’ is much grander. It sees them as “a key stepping stone on the path to AGI, since they make it possible to train AI agents in an unlimited curriculum of rich simulation environments”. AGI stands for artificial general intelligence: an AI that can understand or learn a range of tasks that humans can.