Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

Claude AI can recall past chats only when you ask

A New Robot from Google DeepMind Can Beat Humans at Table Tennis

Elon Musk Accuses Apple of Favoring OpenAI, Threatens Antitrust Lawsuit Over App Store Rankings

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Business AI
    • Advanced AI News Features
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
VentureBeat AI

Study warns of security risks as ‘OS agents’ gain control of computers and phones

By Advanced AI EditorAugust 12, 2025No Comments7 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now

Researchers have published the most comprehensive survey to date of so-called “OS Agents” — artificial intelligence systems that can autonomously control computers, mobile phones and web browsers by directly interacting with their interfaces. The 30-page academic review, accepted for publication at the prestigious Association for Computational Linguistics conference, maps a rapidly evolving field that has attracted billions in investment from major technology companies.

“The dream to create AI assistants as capable and versatile as the fictional J.A.R.V.I.S from Iron Man has long captivated imaginations,” the researchers write. “With the evolution of (multimodal) large language models ((M)LLMs), this dream is closer to reality.”

The survey, led by researchers from Zhejiang University and OPPO AI Center, comes as major technology companies race to deploy AI agents that can perform complex digital tasks. OpenAI recently launched “Operator,” Anthropic released “Computer Use,” Apple introduced enhanced AI capabilities in “Apple Intelligence,” and Google unveiled “Project Mariner” — all systems designed to automate computer interactions.

OS agents work by observing computer screens and system data, then executing actions like clicks and swipes across mobile, desktop and web platforms. The systems must understand interfaces, plan multi-step tasks and translate those plans into executable code. (Credit: GitHub)

Tech giants rush to deploy AI that controls your desktop

The speed at which academic research has transformed into consumer-ready products is unprecedented, even by Silicon Valley standards. The survey reveals a research explosion: over 60 foundation models and 50 agent frameworks developed specifically for computer control, with publication rates accelerating dramatically since 2023.

AI Scaling Hits Its Limits

Power caps, rising token costs, and inference delays are reshaping enterprise AI. Join our exclusive salon to discover how top teams are:

Turning energy into a strategic advantage

Architecting efficient inference for real throughput gains

Unlocking competitive ROI with sustainable AI systems

Secure your spot to stay ahead: https://bit.ly/4mwGngO

This isn’t just incremental progress. We’re witnessing the emergence of AI systems that can genuinely understand and manipulate the digital world the way humans do. Current systems work by taking screenshots of computer screens, using advanced computer vision to understand what’s displayed, then executing precise actions like clicking buttons, filling forms, and navigating between applications.

“OS Agents can complete tasks autonomously and have the potential to significantly enhance the lives of billions of users worldwide,” the researchers note. “Imagine a world where tasks such as online shopping, travel arrangements booking, and other daily activities could be seamlessly performed by these agents.”

The most sophisticated systems can handle complex multi-step workflows that span different applications — booking a restaurant reservation, then automatically adding it to your calendar, then setting a reminder to leave early for traffic. What took humans minutes of clicking and typing can now happen in seconds, without human intervention.

The development of AI agents requires a complex training pipeline that combines multiple approaches, from initial pre-training on screen data to reinforcement learning that optimizes performance through trial and error. (Credit: arxiv.org)

Why security experts are sounding alarms about AI-controlled corporate systems

For enterprise technology leaders, the promise of productivity gains comes with a sobering reality: these systems represent an entirely new attack surface that most organizations aren’t prepared to defend.

The researchers dedicate substantial attention to what they diplomatically term “safety and privacy” concerns, but the implications are more alarming than their academic language suggests. “OS Agents are confronted with these risks, especially considering its wide applications on personal devices with user data,” they write.

The attack methods they document read like a cybersecurity nightmare. “Web Indirect Prompt Injection” allows malicious actors to embed hidden instructions in web pages that can hijack an AI agent’s behavior. Even more concerning are “environmental injection attacks” where seemingly innocuous web content can trick agents into stealing user data or performing unauthorized actions.

Consider the implications: an AI agent with access to your corporate email, financial systems, and customer databases could be manipulated by a carefully crafted web page to exfiltrate sensitive information. Traditional security models, built around human users who can spot obvious phishing attempts, break down when the “user” is an AI system that processes information differently.

The survey reveals a concerning gap in preparedness. While general security frameworks exist for AI agents, “studies on defenses specific to OS Agents remain limited.” This isn’t just an academic concern — it’s an immediate challenge for any organization considering deployment of these systems.

The reality check: Current AI agents still struggle with complex digital tasks

Despite the hype surrounding these systems, the survey’s analysis of performance benchmarks reveals significant limitations that temper expectations for immediate widespread adoption.

Success rates vary dramatically across different tasks and platforms. Some commercial systems achieve success rates above 50% on certain benchmarks — impressive for a nascent technology — but struggle with others. The researchers categorize evaluation tasks into three types: basic “GUI grounding” (understanding interface elements), “information retrieval” (finding and extracting data), and complex “agentic tasks” (multi-step autonomous operations).

The pattern is telling: current systems excel at simple, well-defined tasks but falter when faced with the kind of complex, context-dependent workflows that define much of modern knowledge work. They can reliably click a specific button or fill out a standard form, but struggle with tasks that require sustained reasoning or adaptation to unexpected interface changes.

This performance gap explains why early deployments focus on narrow, high-volume tasks rather than general-purpose automation. The technology isn’t yet ready to replace human judgment in complex scenarios, but it’s increasingly capable of handling routine digital busywork.

OS agents rely on interconnected systems for perception, planning, memory and action execution. The complexity of coordinating these components helps explain why current systems still struggle with sophisticated tasks. (Credit: arxiv.org)

What happens when AI agents learn to customize themselves for every user

Perhaps the most intriguing — and potentially transformative — challenge identified in the survey involves what researchers call “personalization and self-evolution.” Unlike today’s stateless AI assistants that treat every interaction as independent, future OS agents will need to learn from user interactions and adapt to individual preferences over time.

“Developing personalized OS Agents has been a long-standing goal in AI research,” the authors write. “A personal assistant is expected to continuously adapt and provide enhanced experiences based on individual user preferences.”

This capability could fundamentally change how we interact with technology. Imagine an AI agent that learns your email writing style, understands your calendar preferences, knows which restaurants you prefer, and can make increasingly sophisticated decisions on your behalf. The potential productivity gains are enormous, but so are the privacy implications.

The technical challenges are substantial. The survey points to the need for better multimodal memory systems that can handle not just text but images and voice, presenting “significant challenges” for current technology. How do you build a system that remembers your preferences without creating a comprehensive surveillance record of your digital life?

For technology executives evaluating these systems, this personalization challenge represents both the greatest opportunity and the largest risk. The organizations that solve it first will gain significant competitive advantages, but the privacy and security implications could be severe if handled poorly.

The race to build AI assistants that can truly operate like human users is intensifying rapidly. While fundamental challenges around security, reliability, and personalization remain unsolved, the trajectory is clear. The researchers maintain an open-source repository tracking developments, acknowledging that “OS Agents are still in their early stages of development” with “rapid advancements that continue to introduce novel methodologies and applications.”

The question isn’t whether AI agents will transform how we interact with computers — it’s whether we’ll be ready for the consequences when they do. The window for getting the security and privacy frameworks right is narrowing as quickly as the technology is advancing.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleSeoul-based Datumo raises $15.5M to take on Scale AI, backed by Salesforce
Next Article Where Can I Buy Kratom Near Me? Mit Therapy’s Tips For Smart Purchases
Advanced AI Editor
  • Website

Related Posts

TD Securities taps Layer 6 and OpenAI to deliver real-time equity insights to sales and trading teams

August 11, 2025

OpenAI is editing its GPT-5 rollout on the fly

August 11, 2025

AI’s promise of opportunity masks a reality of managed displacement

August 10, 2025

Comments are closed.

Latest Posts

Midjourney Slams Lawsuit Filed by Disney to Prevent AI Training

Smithsonian Updates Museum Display on Impeachment To Include Trump

Funder Tried to Hijack Kandinsky Art Theft Suits, Says Collector

Historic Ukrainian Synagogue Damaged by Russian Drone Strike

Latest Posts

Claude AI can recall past chats only when you ask

August 12, 2025

A New Robot from Google DeepMind Can Beat Humans at Table Tennis

August 12, 2025

Elon Musk Accuses Apple of Favoring OpenAI, Threatens Antitrust Lawsuit Over App Store Rankings

August 12, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • Claude AI can recall past chats only when you ask
  • A New Robot from Google DeepMind Can Beat Humans at Table Tennis
  • Elon Musk Accuses Apple of Favoring OpenAI, Threatens Antitrust Lawsuit Over App Store Rankings
  • Does using AI dumb you down?
  • HPE Expands NVIDIA AI Enterprise Integration with Blackwell GPU Solutions

Recent Comments

  1. EdwardEnror on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  2. ThomasWep on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  3. ThomasWep on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  4. EdwardEnror on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  5. ThomasWep on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.