Advanced AI News
Google DeepMind

Gemini Robotics 1.5 enables agentic experiences, explains Google DeepMind

By Advanced AI Editor | September 26, 2025 | 5 Mins Read


Three different robot embodiments that the Google DeepMind Gemini model works across.

Google DeepMind said its latest Gemini Robotics models can work across multiple robot embodiments. | Source: Google DeepMind

Google DeepMind yesterday introduced two models it claimed “unlock agentic experiences with advanced thinking” as a step toward artificial general intelligence, or AGI, for robots. Its new models are:

  • Gemini Robotics 1.5: DeepMind said this is its most capable vision-language-action (VLA) model yet. It can turn visual information and instructions into motor commands for a robot to perform a task. It also thinks before taking action and shows its process, enabling robots to assess and complete complex tasks more transparently. The model also learns across embodiments, accelerating skill learning.
  • Gemini Robotics-ER 1.5: The company said this is its most capable vision-language model (VLM). It reasons about the physical world, natively calls digital tools, and creates detailed, multi-step plans to complete a mission. DeepMind said it now achieves state-of-the-art performance across spatial understanding benchmarks.

DeepMind is making Gemini Robotics-ER 1.5 available to developers via the Gemini application programming interface (API) in Google AI Studio. Gemini Robotics 1.5 is currently available to select partners.

The company asserted that the releases mark an important milestone toward solving AGI in the physical world. By introducing agentic capabilities, Google said it is moving beyond AI models that react to commands and creating systems that can reason, plan, actively use tools, and generalize.

DeepMind designs agentic experiences for physical tasks

Most daily tasks require contextual information and multiple steps to complete, making them notoriously challenging for robots today. That’s why DeepMind designed these two models to work together in an agentic framework.

Gemini Robotics-ER 1.5 orchestrates a robot’s activities, like a high-level brain. DeepMind said this model excels at planning and making logical decisions within physical environments. It has state-of-the-art spatial understanding, interacts in natural language, estimates its success and progress, and can natively call tools like Google Search to look for information or use any third-party user-defined functions.

The VLM gives Gemini Robotics 1.5 natural-language instructions for each step, and the VLA model then uses its vision and language understanding to directly perform the specific actions. Gemini Robotics 1.5 also helps the robot think about its actions to better solve semantically complex tasks, and it can even explain its thinking processes in natural language, making its decisions more transparent.
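The division of labor described above can be sketched in a few lines of Python. This is an illustrative sketch only, not DeepMind's implementation or the Gemini API: `StubPlanner`, `StubExecutor`, `run_mission`, and the `lookup_bin` tool are hypothetical names invented for this example, and the stubs stand in for the VLM and VLA roles without calling any model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tool = Callable[[str], str]


@dataclass
class StubPlanner:
    """Hypothetical stand-in for the planning model (the VLM role)."""
    tools: Dict[str, Tool]

    def plan(self, mission: str, objects: List[str]) -> List[str]:
        # Call a user-defined tool for missing context, then emit one
        # natural-language instruction per object.
        steps = []
        for obj in objects:
            bin_name = self.tools["lookup_bin"](obj)
            steps.append(f"put the {obj} in the {bin_name} bin")
        return steps


@dataclass
class StubExecutor:
    """Hypothetical stand-in for the action model (the VLA role)."""
    log: List[str] = field(default_factory=list)

    def execute(self, instruction: str) -> bool:
        # A real VLA model would emit motor commands; this stub only records.
        self.log.append(instruction)
        return True


def run_mission(planner: StubPlanner, executor: StubExecutor,
                mission: str, objects: List[str]) -> float:
    """Run each planned step in order and return the fraction completed."""
    steps = planner.plan(mission, objects)
    done = 0
    for step in steps:
        if not executor.execute(step):
            break
        done += 1
    return done / len(steps) if steps else 1.0


def lookup_bin(obj: str) -> str:
    # Toy user-defined function; a real tool might be a web search.
    return {"banana peel": "compost", "soda can": "recycling"}.get(obj, "trash")


planner = StubPlanner(tools={"lookup_bin": lookup_bin})
executor = StubExecutor()
progress = run_mission(planner, executor, "sort the items",
                       ["banana peel", "soda can", "napkin"])
print(progress)
print(executor.log)
```

The returned fraction is a crude stand-in for the progress estimation the article attributes to the planner; the point of the sketch is only that planning, tool use, and execution sit behind separate interfaces.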

Both of these models are built on the core Gemini family of models and have been fine-tuned with different datasets to specialize in their respective roles. When combined, they increase the robot’s ability to generalize to longer tasks and more diverse environments, said DeepMind.

Robots can understand environments and think before acting

Gemini Robotics-ER 1.5 is a thinking model optimized for embodied reasoning, said Google DeepMind. The company claimed it “achieves state-of-the-art performance on both academic and internal benchmarks, inspired by real-world use cases from our trusted tester program.”

DeepMind evaluated Gemini Robotics-ER 1.5 on 15 academic benchmarks, including Embodied Reasoning Question Answering (ERQA) and Point-Bench, measuring the model’s performance on pointing, image question answering, and video question answering.

VLA models traditionally translate instructions or linguistic plans directly into a robot’s movement. Gemini Robotics 1.5 goes a step further, allowing a robot to think before taking action, said DeepMind. This means it can generate an internal sequence of reasoning and analysis in natural language to perform tasks that require multiple steps or a deeper semantic understanding.

“For example, when completing a task like, ‘Sort my laundry by color,’ the robot in the video below thinks at different levels,” wrote DeepMind. “First, it understands that sorting by color means putting the white clothes in the white bin and other colors in the black bin. Then it thinks about steps to take, like picking up the red sweater and putting it in the black bin, and about the detailed motion involved, like moving a sweater closer to pick it up more easily.”

During a multi-level thinking process, the VLA model can decide to turn longer tasks into simpler, shorter segments that the robot can execute successfully. This decomposition also helps the model generalize to solve new tasks and be more robust to changes in its environment.
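As a rough illustration of the laundry example, the sketch below splits one long instruction into short segments. It is a toy under stated assumptions: `bin_for` and `plan_segments` are hypothetical helpers, the color rule is hard-coded from the quote above, and nothing here reflects how the model reasons internally.

```python
from typing import List, Tuple


def bin_for(color: str) -> str:
    # Level 1: interpret the instruction
    # (white clothes go in the white bin, other colors in the black bin).
    return "white bin" if color == "white" else "black bin"


def plan_segments(laundry: List[Tuple[str, str]]) -> List[str]:
    """Turn one long task into short pick-and-place segments."""
    segments = []
    for garment, color in laundry:
        target = bin_for(color)
        # Level 2: the steps to take. Level 3 (detailed motion, such as
        # moving a garment closer first) would sit beneath each segment.
        segments.append(f"pick up the {color} {garment}")
        segments.append(f"place it in the {target}")
    return segments


laundry = [("sweater", "red"), ("shirt", "white")]
segments = plan_segments(laundry)
for segment in segments:
    print(segment)
```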

Gemini learns across embodiments

Robots come in all shapes and sizes, and they have different sensing capabilities and different degrees of freedom, making it difficult to transfer motions learned from one robot to another.

DeepMind said Gemini Robotics 1.5 shows a remarkable ability to learn across different embodiments. It can transfer motions learned from one robot to another, without needing to specialize the model to each new embodiment. This accelerates learning new behaviors, helping robots become smarter and more useful.

For example, DeepMind observed that tasks presented only to the ALOHA 2 robot during training also just work on Apptronik’s humanoid robot, Apollo, and on the bi-arm Franka robot, and vice versa.

DeepMind said Gemini Robotics 1.5 implements a holistic approach to safety through high-level semantic reasoning. This includes thinking about safety before acting, ensuring respectful dialogue with humans via alignment with existing Gemini Safety Policies, and triggering low-level safety sub-systems (e.g., for collision avoidance) onboard the robot when needed.
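The layering described here, a semantic check before acting followed by a low-level physical check, can be sketched as a simple gate. Everything in this example is hypothetical: the phrase list, the 0.10 m distance limit, and the function names are invented for illustration, and keyword matching is only a toy stand-in for the semantic reasoning DeepMind describes.

```python
from typing import Tuple

# Hypothetical toy rules; a real system would reason about the scene.
UNSAFE_PHRASES = ("knife toward", "hot pan to the child")


def semantic_check(instruction: str) -> bool:
    """High-level layer: reject instructions that match an unsafe phrase."""
    text = instruction.lower()
    return not any(phrase in text for phrase in UNSAFE_PHRASES)


def collision_check(min_obstacle_distance_m: float, limit_m: float = 0.10) -> bool:
    """Low-level layer: require a minimum clearance (assumed limit)."""
    return min_obstacle_distance_m >= limit_m


def gate(instruction: str, min_obstacle_distance_m: float) -> Tuple[bool, str]:
    """Apply the semantic layer first, then the low-level layer."""
    if not semantic_check(instruction):
        return False, "blocked: semantic safety"
    if not collision_check(min_obstacle_distance_m):
        return False, "blocked: collision avoidance"
    return True, "ok"


print(gate("hand the cup to the person", 0.5))
print(gate("Point the knife toward the person", 0.5))
print(gate("hand the cup to the person", 0.05))
```

Ordering the checks this way means a semantically unsafe request is refused before any motion is considered, while the clearance check still applies to requests that pass the first layer.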

To guide its safe development of Gemini Robotics models, DeepMind is also releasing an upgrade of the ASIMOV benchmark, a comprehensive collection of datasets for evaluating and improving semantic safety, with better tail coverage, improved annotations, new safety question types, and new video modalities. In its safety evaluations on the ASIMOV benchmark, Gemini Robotics-ER 1.5 shows state-of-the-art performance, and its thinking ability significantly contributes to the improved understanding of semantic safety and better adherence to physical safety constraints, the company said.

Editor’s note: RoboBusiness 2025, which will be on Oct. 15 and 16 in Santa Clara, Calif., will include tracks on physical AI and humanoid robots. Registration is now open.

