Advanced AI News

Unlocking the 3D World with 2D Data: The First Multi-View Video Diffusion Framework for Kinematic Part Segmentation

By Advanced AI Editor, September 23, 2025


Hao Zhang is a PhD student at the University of Illinois at Urbana-Champaign (UIUC) whose work focuses on 3D/4D reconstruction, generative modeling, and physics-driven animation. He is currently a research intern at Snap Inc. and has previously interned at Stability AI and the Shanghai Artificial Intelligence Laboratory. This project, Stable Part Diffusion 4D (SP4D), is a collaboration between Stability AI and UIUC. It generates temporally and spatially consistent multi-view RGB and kinematic part sequences from monocular video, and these sequences can then be lifted into riggable 3D assets. Personal homepage: https://haoz19.github.io/

In character animation and 3D content creation, rigging and part segmentation are essential for creating animatable assets. However, existing methods have significant limitations:

Automatic Rigging: Existing methods depend on small 3D datasets with skeleton and skinning annotations. They struggle to cover diverse object shapes and complex poses, so the resulting models generalize poorly.
Part Segmentation: Current methods usually segment by semantic or appearance cues (such as “head,” “tail,” or “legs”) and do not model the true kinematic structure. The resulting parts are unstable across viewpoints and over time, which makes them hard to use directly in animation-driven tasks.

Our core motivation is to address both problems by leveraging large-scale 2D data and the strong priors of pre-trained diffusion models, first for kinematic part segmentation and then, by extension, for automatic rigging. This sidesteps the bottleneck of scarce 3D data and lets the model learn to generate animatable 3D assets whose parts move in a physically plausible way.

Research Methods and Innovations

Building on this motivation, we propose Stable Part Diffusion 4D (SP4D), the first multi-view video diffusion framework for kinematic part segmentation. Key innovations include:

Dual-Branch Diffusion Architecture: Generates appearance and kinematic structure simultaneously, modeling RGB and parts jointly.
BiDiFuse Bidirectional Fusion Module: Enables cross-modal interaction between the RGB and part branches to improve structural consistency.
Contrastive Consistency Loss: Encourages the same part to remain stable and consistent across viewpoints and over time.
KinematicParts20K Dataset: The team built a dataset of over 20,000 objects with skeletal annotations from Objaverse-XL, providing high-quality training and evaluation data.
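The article does not give the form of the contrastive consistency loss, so the following is only a minimal sketch of the general idea, assuming a standard InfoNCE-style objective over per-sample feature vectors. The function name, the `temperature` parameter, and the choice of cosine similarity are illustrative assumptions, not the authors' implementation.

```python
import math


def contrastive_consistency_loss(features, part_ids, temperature=0.1):
    """Hypothetical InfoNCE-style loss (an assumption, not SP4D's exact loss).

    Features that share a part label (for example the same part seen from
    another view or frame) are pulled together; all others are pushed apart.

    features: list of equal-length float vectors, one per sampled region
    part_ids: list of integer part labels, same length as features
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    feats = [normalize(v) for v in features]
    n = len(feats)
    total, count = 0.0, 0
    for i in range(n):
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(feats[i], feats[j])) / temperature
            for j in range(n) if j != i
        }
        positives = [j for j in sims if part_ids[j] == part_ids[i]]
        if not positives:
            continue  # no same-part sample for this anchor
        # Numerically stable log-sum-exp over all other samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Mean negative log-likelihood of the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0


# Two parts, each observed twice (say, in two views).
consistent = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
inconsistent = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
loss_good = contrastive_consistency_loss(consistent, labels)
loss_bad = contrastive_consistency_loss(inconsistent, labels)
```

In this toy setup the loss is close to zero when both observations of a part share the same feature and is much larger when the features disagree, which is the behavior a cross-view, cross-time consistency term is meant to reward.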

The framework generates temporally and spatially consistent part segmentation and also lifts the results to riggable 3D meshes, deriving skeletal structures and skinning weights that can be used directly in animation production.

Experimental Results

On the KinematicParts20K validation set, SP4D achieved significant improvements over existing methods:

Segmentation Accuracy: mIoU improved to 0.68, significantly ahead of SAM2 (0.15) and DeepViT (0.17).
Structural Consistency: ARI reached 0.60, far exceeding SAM2’s 0.05.
User Study: Across the criteria of part clarity, cross-view consistency, and animation adaptability, SP4D averaged 4.26/5, compared with 1.96 for SAM2 and 1.85 for DeepViT (arXiv:2509.10687v1).
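For readers unfamiliar with the two segmentation metrics above, here is a small reference sketch of mean IoU and the Adjusted Rand Index (ARI) on flat label lists. The article does not describe the evaluation protocol, so how predicted part IDs are matched to ground-truth IDs for mIoU is an assumption here (the labels are taken as already aligned); the function names are illustrative.

```python
from collections import Counter
from math import comb


def mean_iou(pred, gt):
    """Mean IoU over every label that appears in pred or gt.

    Assumes non-empty inputs whose predicted labels are already matched
    to the ground-truth labels.
    """
    ious = []
    for label in set(pred) | set(gt):
        inter = sum(1 for p, g in zip(pred, gt) if p == label and g == label)
        union = sum(1 for p, g in zip(pred, gt) if p == label or g == label)
        ious.append(inter / union)
    return sum(ious) / len(ious)


def adjusted_rand_index(pred, gt):
    """Adjusted Rand Index: chance-corrected agreement between two labelings.

    Invariant to how the labels are named, so no label matching is needed.
    """
    n = len(pred)
    if n < 2:
        return 1.0  # no pairs to disagree on
    sum_cells = sum(comb(c, 2) for c in Counter(zip(pred, gt)).values())
    sum_pred = sum(comb(c, 2) for c in Counter(pred).values())
    sum_gt = sum(comb(c, 2) for c in Counter(gt).values())
    expected = sum_pred * sum_gt / comb(n, 2)
    max_index = (sum_pred + sum_gt) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


gt_parts = [0, 0, 1, 1, 2, 2]
renamed = [5, 5, 7, 7, 9, 9]  # same grouping, different label names
ari_renamed = adjusted_rand_index(renamed, gt_parts)
miou_partial = mean_iou([0, 0, 1, 1], [0, 1, 1, 1])
ari_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

ARI scores a segmentation by how it groups pixels rather than by what the groups are called, which makes it a natural measure of structural consistency. mIoU is computed per label and therefore depends on the label matching.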

In the automatic rigging task, SP4D also demonstrated stronger potential:

On KinematicParts20K-test, SP4D achieved a Rigging Precision of 72.7, ahead of MagicArticulate (63.7) and UniRig (64.3).
In user evaluations of animation naturalness, SP4D averaged 4.1/5, compared with 2.7 for MagicArticulate and 2.3 for UniRig, suggesting better generalization to unseen categories and complex shapes.

These results indicate that the 2D-prior-driven approach addresses long-standing challenges in kinematic part segmentation and also extends effectively to automatic rigging, moving animation and 3D asset generation closer to full automation.

Conclusion

Stable Part Diffusion 4D (SP4D) is both a technical advance and the product of cross-institution collaboration, and it has been accepted as a Spotlight at NeurIPS 2025. It shows how large-scale 2D priors can open new avenues in 3D kinematic modeling and automatic rigging, laying a foundation for automation in fields such as animation, gaming, AR/VR, and robotic simulation.


