X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real
Prithwish Dan, Kushal Kedia, Angela Chao, Edward Weiyi Duan, Maximus Adrian Pace, Wei-Chiu Ma, Sanjiban Choudhury
Abstract: Human videos offer a scalable way to train robot manipulation policies, but lack the action labels needed by standard imitation learning algorithms. Existing cross-embodiment approaches try to map human motion to robot actions, but often fail when the embodiments differ significantly. We propose X-Sim, a real-to-sim-to-real framework that uses object motion as a dense and transferable signal for learning robot policies. X-Sim starts by reconstructing a photorealistic simulation from an RGBD human video and tracking object trajectories to define object-centric rewards. These rewards are used to train a reinforcement learning (RL) policy in simulation. The learned policy is then distilled into an image-conditioned diffusion policy using synthetic rollouts rendered with varied viewpoints and lighting. To transfer to the real world, X-Sim introduces an online domain adaptation technique that aligns real and simulated observations during deployment. Importantly, X-Sim does not require any robot teleoperation data. We evaluate it across 5 manipulation tasks in 2 environments and show that it: (1) improves task progress by 30% on average over hand-tracking and sim-to-real baselines, (2) matches behavior cloning with 10x less data collection time, and (3) generalizes to new camera viewpoints and test-time changes. Code and videos are available at this https URL.
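To illustrate how tracked object trajectories can define a dense, object-centric reward of the kind the abstract describes, the sketch below penalizes the simulated object's per-timestep deviation from the reference pose extracted from the human video. The function name, weighting, and pose representation are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def object_centric_reward(obj_pos, obj_quat, ref_pos, ref_quat,
                          pos_weight=1.0, rot_weight=0.1):
    """Hypothetical dense reward: penalize deviation of the simulated
    object's pose from the pose tracked in the human video at the same
    timestep. Positions are (3,) arrays, quaternions are (4,) unit arrays."""
    pos_err = np.linalg.norm(obj_pos - ref_pos)
    # Quaternion distance via the absolute dot product (treats q and -q as equal).
    rot_err = 1.0 - abs(float(np.dot(obj_quat, ref_quat)))
    return -(pos_weight * pos_err + rot_weight * rot_err)

# Example: object 5 cm from the reference position, orientation matching exactly.
r = object_centric_reward(np.array([0.05, 0.0, 0.0]), np.array([1.0, 0.0, 0.0, 0.0]),
                          np.zeros(3), np.array([1.0, 0.0, 0.0, 0.0]))
print(r)  # -0.05
```

A reward of this form depends only on object state, not on the demonstrator's body, which is what makes it transferable from a human video to a robot embodiment.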
Submission history
From: Prithwish Dan
[v1] Sun, 11 May 2025 19:04:00 UTC (9,720 KB)
[v2] Thu, 15 May 2025 00:43:19 UTC (9,720 KB)