Paper Page - Embodied Web Agents: Bridging Physical-Digital Realms For Integrated Agent Intelligence

AI agents today are mostly siloed: they either retrieve and reason over vast amounts of digital information and knowledge obtained online, or interact with the physical world through embodied perception, planning, and action, but rarely both. This separation limits their ability to solve tasks that require integrated physical and digital intelligence, such as cooking from online recipes, navigating with dynamic map data, or interpreting real-world landmarks using web knowledge. We introduce Embodied Web Agents, a novel paradigm for AI agents that fluidly bridge embodiment and web-scale reasoning. To operationalize this concept, we first develop the Embodied Web Agents task environments, a unified simulation platform that tightly integrates realistic 3D indoor and outdoor environments with functional web interfaces. Building upon this platform, we construct and release the Embodied Web Agents Benchmark, which encompasses a diverse suite of tasks including cooking, navigation, shopping, tourism, and geolocation. All of these tasks require coordinated reasoning across physical and digital realms, which allows systematic assessment of cross-domain intelligence. Experimental results reveal significant performance gaps between state-of-the-art AI systems and human capabilities, establishing both challenges and opportunities at the intersection of embodied cognition and web-scale knowledge access.
