Key Points
- Genie 3 is a foundation world model that can train general-purpose AI agents
- It can generate interactive 3D environments and features promptable world events
- Genie 3’s simulations stay physically consistent over time
- The model can remember what it previously generated and reason over long time horizons
- Genie 3 has implications for educational experiences, gaming, and prototyping creative concepts
- It can push AI agents to their limits, forcing them to learn from their own experience
Google DeepMind has revealed Genie 3, its latest foundation world model that can be used to train general-purpose AI agents, a capability that the AI lab says makes for a crucial stepping stone on the path to “artificial general intelligence,” or human-like intelligence.
Genie 3 is the first real-time interactive general-purpose world model, according to Shlomi Fruchter, a research director at DeepMind. It goes beyond earlier, narrower world models and can generate both photo-realistic and imaginary worlds.
Key Features of Genie 3
Genie 3 builds on both its predecessor Genie 2 and DeepMind’s latest video generation model Veo 3. With a simple text prompt, Genie 3 can generate multiple minutes of interactive 3D environments at 720p resolution at 24 frames per second.
The model also features “promptable world events,” or the ability to use a prompt to change the generated world. Perhaps most importantly, Genie 3’s simulations stay physically consistent over time because the model can remember what it previously generated.
Fruchter said that while Genie 3 has implications for educational experiences, gaming, and prototyping creative concepts, its real value will lie in training agents for general-purpose tasks, which he said is essential to reaching AGI.
Training Agents for General-Purpose Tasks
Genie 3 is meant to address a bottleneck in training agents for general-purpose tasks. Like Veo, it doesn’t rely on a hard-coded physics engine; instead, DeepMind says, the model teaches itself how the world works by remembering what it has generated and reasoning over long time horizons.
The model is auto-regressive, meaning it generates one frame at a time and has to look back at what it generated before to decide what happens next. That memory contributes to the consistency of Genie 3’s simulated worlds, which in turn allows the model to develop a grasp of physics.
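To make the frame-by-frame idea concrete, here is a toy Python sketch of an auto-regressive rollout. It is not Genie 3's implementation, which DeepMind has not published in this form. Each "frame" is just a number (an object's position), and the names next_frame, rollout, and context_len are hypothetical. The point is only that each new frame is computed from a bounded memory of earlier frames plus the latest user action.

```python
from collections import deque


def next_frame(history, action):
    """Hypothetical stand-in for a learned model.

    Looks back at the last two frames to infer the object's velocity,
    then continues that motion and applies the user's action.
    """
    velocity = history[-1] - history[-2] if len(history) >= 2 else 0
    return history[-1] + velocity + action


def rollout(first_frame, actions, context_len=4):
    """Generate frames one at a time, feeding each result back in."""
    # Bounded memory: only the most recent `context_len` frames are kept.
    history = deque([first_frame], maxlen=context_len)
    frames = [first_frame]
    for action in actions:
        frame = next_frame(history, action)
        history.append(frame)  # the new frame becomes part of the context
        frames.append(frame)
    return frames


# A push of +1, two steps with no input, then a push of -1.
print(rollout(0, [1, 0, 0, -1]))  # [0, 1, 2, 3, 3]
```

In this sketch the object keeps moving during the two steps with no input because the rule infers velocity from earlier frames. That is a very simplified picture of how looking back at past output can keep a generated world consistent.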
Notably, DeepMind says the model also has the potential to push AI agents to their limits, forcing them to learn from their own experience, similar to how humans learn in the real world.
Source: techcrunch.com