Towards Physical Intelligence World Models Via Unbounded Surface Evolution

By Advanced AI Editor, June 6, 2025

[Submitted on 29 May 2025 (v1), last revised 5 Jun 2025 (this version, v2)]

FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution, by Xiaoyi Liu and Hao Tang

Abstract: Physical intelligence, the ability to anticipate and shape the world from partial, multisensory observations, is critical for next-generation world models. We propose FOLIAGE, a physics-informed multimodal world model for unbounded accretive surface growth. In its Action-Perception loop, a unified context encoder maps images, mesh connectivity, and point clouds to a shared latent state. A physics-aware predictor, conditioned on physical control actions, advances this latent state in time to align with the target latent of the surface, yielding a Modality-Agnostic Growth Embedding (MAGE) that interfaces with critic heads for downstream objectives. FOLIAGE’s Accretive Graph Network (AGN) captures dynamic connectivity through Age Positional Encoding and Energy-Gated Message-Passing. Geometry-Correspondence Fusion and Cross-Patch Masking enhance MAGE’s expressiveness, while Hierarchical Pooling balances global context with local dynamics. We create SURF-GARDEN, a world model learning platform comprising a Counterfactual Physics Simulator, a Multimodal Correspondence Extractor, and Evolution Tracing, which generates 7,200 diverse surface-growth sequences. SURF-BENCH, our physical-intelligence evaluation suite, covers six core tasks (topology recognition, inverse material estimation, growth-stage classification, latent roll-out, cross-modal retrieval, and dense correspondence) and four stress tests (sensor dropout, zero-shot modality transfer, long-horizon prediction, and physics ablation) that probe resilience. FOLIAGE outperforms specialized baselines while remaining robust across dynamic environments, establishing a new world-model-based, multimodal pathway to physical intelligence.
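
The Action-Perception loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural details, so everything below is an assumption made for illustration, including the bias-free linear layers, the averaging of modalities, the mean-squared-error alignment loss, and all names and dimensions (LATENT_DIM, ACTION_DIM, encode, predict, latent_loss). It shows only the general pattern of encoding several modalities into one shared latent, advancing that latent with an action-conditioned predictor, and comparing the result with the latent of the next observation.

```python
import random

# All sizes and names here are hypothetical; none come from the paper.
LATENT_DIM = 4
ACTION_DIM = 2


def linear(weights, x):
    """Apply a bias-free dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode(observations, encoders):
    """Project each available modality into the latent space and average.

    Averaging is an assumed stand-in for the paper's unified context encoder.
    """
    projected = [linear(encoders[name], feats) for name, feats in observations.items()]
    return [sum(vals) / len(projected) for vals in zip(*projected)]


def predict(latent, action, predictor):
    """Advance the latent one step, conditioned on the control action."""
    return linear(predictor, latent + action)  # list concatenation


def latent_loss(predicted, target):
    """Mean squared error between predicted and target latents (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


rng = random.Random(0)
feature_dims = {"image": 6, "mesh": 5, "points": 3}  # made-up feature sizes
encoders = {m: random_matrix(LATENT_DIM, d, rng) for m, d in feature_dims.items()}
predictor = random_matrix(LATENT_DIM, LATENT_DIM + ACTION_DIM, rng)

# Two consecutive toy observations and the action applied between them.
obs_t = {m: [rng.random() for _ in range(d)] for m, d in feature_dims.items()}
obs_next = {m: [rng.random() for _ in range(d)] for m, d in feature_dims.items()}
action = [0.1, -0.3]

z_t = encode(obs_t, encoders)             # shared latent at time t
z_pred = predict(z_t, action, predictor)  # predicted latent at time t + 1
z_target = encode(obs_next, encoders)     # target latent from the next observation
loss = latent_loss(z_pred, z_target)

# Dropping modalities still yields a latent of the same size.
z_image_only = encode({"image": obs_t["image"]}, encoders)
```

Because encode averages over whichever modalities are present, the same predictor can be fed a latent built from a subset of sensors, which is the kind of situation the sensor-dropout stress test is meant to probe.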

Submission history

From: Xiaoyi Liu
[v1] Thu, 29 May 2025 01:16:58 UTC (29,000 KB)
[v2] Thu, 5 Jun 2025 02:00:09 UTC (29,000 KB)
