Paper page - Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

A new dataset and evaluation framework improve zero-shot text-to-motion generation through a large-scale, high-quality dataset and a scalable model architecture.

Generating diverse and natural human motion sequences based on textual
descriptions constitutes a fundamental and challenging research area within the
domains of computer vision, graphics, and robotics. Despite significant
advances in this field, current methods often struggle with
zero-shot generalization, largely because of the
limited size of training datasets. Moreover, the lack of a comprehensive
evaluation framework impedes the advancement of this task by failing to
identify directions for improvement. In this work, we aim to push
text-to-motion into a new era, that is, to achieve zero-shot generalization.
To this end, we first develop an efficient annotation pipeline
and introduce MotionMillion, the largest human motion dataset to date, featuring
over 2,000 hours and 2 million high-quality motion sequences. Additionally, we
propose MotionMillion-Eval, the most comprehensive benchmark for evaluating
zero-shot motion generation. Leveraging a scalable architecture, we scale our
model to 7B parameters and validate its performance on MotionMillion-Eval. Our
results demonstrate strong generalization to out-of-domain and complex
compositional motions, marking a significant step toward zero-shot human motion
generation. The code is available at
https://github.com/VankouF/MotionMillion-Codes.
