We’re powering an era of physical agents with Gemini Robotics 1.5 — enabling robots to perceive, plan, think, use tools and act to better solve complex, multi-step tasks.
🤖 Gemini Robotics 1.5 is our most capable vision-language-action (VLA) model yet: it turns visual information and instructions into motor commands so a robot can perform a task. The model thinks before taking action and shows its process, helping robots assess and complete complex tasks more transparently. It also learns across embodiments, accelerating skill learning.
🤖 Gemini Robotics-ER 1.5 is our most capable vision-language model (VLM) yet: it reasons about the physical world, natively calls digital tools and creates detailed, multi-step plans to complete a mission. It now achieves state-of-the-art performance across spatial-understanding benchmarks.
We’re making Gemini Robotics-ER 1.5 available to developers via the Gemini API in Google AI Studio and Gemini Robotics 1.5 to select partners.
Learn more: