Advanced AI News
Google DeepMind

Google’s new Gemini 2.5 model gives AI agents control over web and mobile interfaces

By Advanced AI Editor | October 8, 2025 | 4 Mins Read


Google DeepMind has released the Gemini 2.5 Computer Use model in public preview. It is a specialised version of Gemini 2.5 Pro built to power AI agents that interact directly with graphical user interfaces (GUIs). The release is a step toward agents that can carry out digital tasks which until now needed a person at the screen, such as filling out web forms, clicking buttons, and working behind login screens.

Developers can access the model through the Gemini API in Google AI Studio and Vertex AI. It is designed to let agents carry out multi-step workflows in web browsers and, according to early results, in mobile applications.

How AI Agents Will Use Your Screen

While most AI models communicate with software through structured programming interfaces (APIs), a large number of real-world tasks still rely on human-facing UIs. The Gemini 2.5 Computer Use model attempts to bridge this gap.

The model operates in a continuous loop, mimicking a human user:

Input: The system sends the agent’s current task request, a screenshot of the computer screen (the environment), and a log of recent actions to the model.

Analysis and Action: The model analyses the inputs and returns a function call. This function call represents a specific UI action, such as “click at coordinates X, Y” or “type ‘text’ into a field.”

Execution and Feedback: Client-side code executes the action in the browser. Afterwards, the system captures a new screenshot and the current URL, sending this updated environment back to the model to restart the loop.

This process continues until the original task finishes, an error occurs, or a safety mechanism stops the agent.
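
The loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than Google's client code: the `Observation` and `Action` types, the action names, the `FakeBrowser` driver, the scripted stand-in for the model, and the `max_steps` limit are all assumptions made for the example, and no real Gemini API call is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """What the client sends to the model on each turn (hypothetical shape)."""
    task: str
    screenshot: bytes
    url: str
    history: list = field(default_factory=list)


@dataclass
class Action:
    """A UI action returned by the model as a function call (hypothetical names)."""
    name: str
    args: dict = field(default_factory=dict)


class FakeBrowser:
    """Stand-in for a real browser driver; it only records requested actions."""

    def __init__(self):
        self.url = "https://example.com/form"
        self.log = []

    def execute(self, action):
        self.log.append((action.name, action.args))

    def screenshot(self):
        return f"screen after {len(self.log)} actions".encode()


def scripted_model(obs):
    """Stub in place of the model: click, type, then report completion."""
    script = [
        Action("click", {"x": 120, "y": 340}),
        Action("type", {"text": "hello"}),
        Action("done"),
    ]
    return script[min(len(obs.history), len(script) - 1)]


def run_agent(task, model, browser, is_safe=lambda action: True, max_steps=10):
    """Observe, ask the model, execute, repeat until a stop condition is hit."""
    history = []
    for _ in range(max_steps):
        # Input: task, fresh screenshot, current URL, and recent actions.
        obs = Observation(task, browser.screenshot(), browser.url, list(history))
        # Analysis and action: the model answers with one UI action.
        action = model(obs)
        if action.name == "done":
            return "finished", history
        if not is_safe(action):
            return "stopped_by_safety", history
        # Execution and feedback: client-side code performs the action.
        try:
            browser.execute(action)
        except Exception:
            return "error", history
        history.append(action)
    return "step_limit", history


status, history = run_agent("fill in the form", scripted_model, FakeBrowser())
print(status, [a.name for a in history])
```

The three exit paths mirror the stop conditions named in the article (task finished, error, safety stop); the step limit is an extra guard added here so the sketch cannot loop forever.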

Google DeepMind reports the model is currently optimised for web browsers but shows strong initial results for controlling mobile UIs. It does not yet offer full control over desktop operating systems.

Performance and Speed

Evaluations run by Google and by partners such as Browserbase indicate that the Gemini 2.5 Computer Use model performs well against other current solutions.

Benchmark results show the model achieving higher accuracy than competing models on several web and mobile control tests, including Online-Mind2Web and WebVoyager. Of particular note for developers, the model reportedly leads on browser control while keeping latency low. Lower latency means agents complete tasks faster, which translates into a better user experience and lower operating costs for business applications.

For instance, Poke.com, an early tester building an AI assistant, stated that the new model finished complex workflows up to 50% faster than other solutions they had considered.

Early Use Cases Emerge

The ability to automate GUI interaction has immediate business applications. Agents built on this model can handle complex data entry, conduct automated research across multiple websites, and manage user accounts.

Early applications already span several key areas:

UI Testing and Debugging: Google’s internal payments platform team is using the model as a fallback for end-to-end UI tests. When a test fails, the model assesses the screen state and determines the actions needed to complete the workflow. This has repaired over 60% of test execution failures that previously required manual developer attention, saving significant development time.

Workflow Automation: Companies like Autotab, which run fully autonomous agents, report that the model reliably parses context even in difficult scenarios, leading to a performance increase of up to 18% on their hardest evaluations. This suggests higher reliability for crucial tasks like data collection and processing.

Agentic Search and Assistance: Versions of the Computer Use model already power other Google products, including Project Mariner and specific agentic features in AI Mode in Search, showcasing its potential as a general-purpose digital assistant.

Addressing New Safety Risks

Agents that can control a computer interface introduce unique security risks, from potential malicious misuse to accidental, harmful actions like unwanted purchases. To manage these new risks, Google DeepMind built specific safety measures directly into the model’s structure, as detailed in the Gemini 2.5 Computer Use System Card.

Developers also receive controls to govern the agent’s behaviour:

Per-Step Safety Service: An out-of-model service evaluates every action the agent suggests before execution. This offers a final check against risky or harmful commands.

User Confirmation for High-Stakes Actions: The model can request user confirmation before performing sensitive actions, such as making a purchase.

System Instructions: Developers can use system instructions to block the agent from automatically completing actions deemed high-risk, like bypassing security measures, compromising a system’s integrity, or controlling critical infrastructure.

These layered defences are intended to help developers build agents that are both powerful and safe. Developers should still test their agents thoroughly before launching to production.
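
On the client side, the per-step check and the confirmation step for high-stakes actions might be combined along the following lines. This is a sketch under stated assumptions: the article does not describe how the safety service reaches its verdicts, so the keyword lists, the `Verdict` values, and the `assess` and `gate` functions are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    CONFIRM = "require_confirmation"
    BLOCK = "block"


@dataclass(frozen=True)
class ProposedAction:
    name: str    # e.g. "click" (hypothetical action name)
    target: str  # human-readable description of the UI element


# Toy policy: a real safety service would not rely on keyword matching.
BLOCKED_KEYWORDS = ("captcha", "disable security")
CONFIRM_KEYWORDS = ("buy", "purchase", "pay")


def assess(action):
    """Stand-in for the per-step safety check run before every action."""
    text = f"{action.name} {action.target}".lower()
    if any(keyword in text for keyword in BLOCKED_KEYWORDS):
        return Verdict.BLOCK
    if any(keyword in text for keyword in CONFIRM_KEYWORDS):
        return Verdict.CONFIRM
    return Verdict.ALLOW


def gate(action, ask_user):
    """Return True only if the action may be executed."""
    verdict = assess(action)
    if verdict is Verdict.BLOCK:
        return False
    if verdict is Verdict.CONFIRM:
        # High-stakes action: defer to the human before proceeding.
        return bool(ask_user(action))
    return True


always_no = lambda action: False
print(gate(ProposedAction("click", "Search button"), always_no))
print(gate(ProposedAction("click", "Buy now"), always_no))
```

A function like `gate` could be passed to an agent loop as the check that runs between the model's suggestion and its execution, so that a blocked or unconfirmed action never reaches the browser.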


