Point cloud learning, especially self-supervised learning without manual
labels, has gained growing attention in both the vision and learning
communities due to its potential utility in a wide range of applications. Most
existing generative approaches for point cloud self-supervised learning focus on
recovering masked points from visible ones within a single view. Recognizing
that a two-view pre-training paradigm inherently introduces greater diversity
and variance, and may thus enable more challenging and informative
pre-training, we explore the potential of two-view learning in this domain.
In this paper, we propose Point-PQAE, a cross-reconstruction generative
paradigm that first generates two decoupled point clouds/views and then
reconstructs one from the other. To achieve this, we develop, for the first
time, a crop mechanism for point cloud view generation and further
propose a novel positional encoding to represent the 3D relative position
between the two decoupled views. This cross-reconstruction significantly
increases the difficulty of pre-training compared with self-reconstruction, which
enables our method to surpass previous single-modal self-reconstruction methods
in 3D self-supervised learning. Specifically, it outperforms the
self-reconstruction baseline (Point-MAE) by 6.5%, 7.0%, and 6.7% in three
variants of ScanObjectNN with the MLP-Linear evaluation protocol. The code is
available at https://github.com/aHapBean/Point-PQAE.
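
As a rough illustration of the cross-reconstruction idea described above, the sketch below shows how one pre-training step might look: two views are cropped from the same point cloud, each view is reconstructed from the other, and a relative positional encoding bridges the two. This is a minimal, hypothetical sketch and not the official Point-PQAE implementation; the crop heuristic, the `encoder`/`decoder` modules, and the `rel_pos_enc` callable are assumed placeholders.

```python
import torch

def crop_view(points, ratio=0.5):
    """Hypothetical crop: keep the fraction of points nearest to a random seed point."""
    center = points[torch.randint(points.shape[0], (1,))]       # (1, 3) random seed point
    dist = torch.norm(points - center, dim=-1)                   # distance of every point to the center
    keep = dist.argsort()[: int(ratio * points.shape[0])]        # indices of the closest points
    return points[keep], center.squeeze(0)                       # (k, 3) view, (3,) view center

def chamfer_distance(pred, target):
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3)."""
    d = torch.cdist(pred, target)                                 # (N, M) pairwise distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def cross_reconstruction_step(points, encoder, decoder, rel_pos_enc):
    """Illustrative sketch only (not the paper's code): reconstruct each cropped view from the other."""
    view_a, center_a = crop_view(points)
    view_b, center_b = crop_view(points)
    # Encode the 3D relative position between the two decoupled views.
    rel_ab = rel_pos_enc(center_b - center_a)
    rel_ba = rel_pos_enc(center_a - center_b)
    # Cross-reconstruction: predict one view from the latent of the other.
    pred_b = decoder(encoder(view_a), rel_ab)
    pred_a = decoder(encoder(view_b), rel_ba)
    return chamfer_distance(pred_b, view_b) + chamfer_distance(pred_a, view_a)
```

In this sketch the loss is symmetric (A is reconstructed from B and vice versa); the actual architecture, crop strategy, and positional encoding follow the paper and repository linked above.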