LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS

LangSplatV2 enhances 3D text querying speed and accuracy by replacing the heavyweight decoder with a sparse coefficient field and efficient CUDA optimization.

In this paper, we introduce LangSplatV2, which achieves high-dimensional
feature splatting at 476.2 FPS and 3D open-vocabulary text querying at 384.6
FPS for high-resolution images, a 42× and a 47× speedup over LangSplat,
respectively, along with improved query accuracy.
LangSplat employs Gaussian Splatting to embed 2D CLIP language features into
3D, significantly enhancing speed and learning a precise 3D language field with
SAM semantics. Such advancements in 3D language fields are crucial for
applications that require language interaction within complex scenes. However,
LangSplat does not yet achieve real-time inference performance (8.2 FPS), even
with advanced A100 GPUs, severely limiting its broader application. In this
paper, we first conduct a detailed time analysis of LangSplat, identifying the
heavyweight decoder as the primary speed bottleneck. Our solution, LangSplatV2,
assumes that each Gaussian acts as a sparse code within a global dictionary,
leading to the learning of a 3D sparse coefficient field that entirely
eliminates the need for a heavyweight decoder. By leveraging this sparsity, we
further propose an efficient sparse coefficient splatting method with CUDA
optimization, rendering high-dimensional feature maps at high quality while
incurring only the time cost of splatting an ultra-low-dimensional feature. Our
experimental results demonstrate that LangSplatV2 not only achieves better or
competitive query accuracy but is also significantly faster. Code and demos
are available at our project page: https://langsplat-v2.github.io.
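
To make the sparse-code idea concrete, here is a minimal sketch of why it can remove the per-pixel decoder. It is not the released implementation. It assumes that each Gaussian stores a few (index, coefficient) pairs into a shared dictionary, and that rendering uses standard front-to-back alpha compositing. Because both compositing and the dictionary product are linear, blending the sparse coefficients first and multiplying by the dictionary once gives the same feature as decoding every Gaussian to full dimension and then blending. The sizes (64 dictionary entries, 512-dimensional features, 4 non-zero coefficients) and all function names are illustrative assumptions, and the CUDA kernel described in the abstract is not reproduced.

```python
import random


def blend_weights(alphas):
    """Front-to-back compositing weights: w_i = a_i * prod_{j<i} (1 - a_j)."""
    weights, transmittance = [], 1.0
    for a in alphas:
        weights.append(a * transmittance)
        transmittance *= 1.0 - a
    return weights


def decode(sparse_code, dictionary):
    """Expand one sparse code [(index, coeff), ...] into a dense feature."""
    dim = len(dictionary[0])
    feat = [0.0] * dim
    for idx, coeff in sparse_code:
        for d in range(dim):
            feat[d] += coeff * dictionary[idx][d]
    return feat


def render_dense(codes, weights, dictionary):
    """Route A: decode every Gaussian to full dimension, then blend."""
    dim = len(dictionary[0])
    out = [0.0] * dim
    for code, w in zip(codes, weights):
        feat = decode(code, dictionary)
        for d in range(dim):
            out[d] += w * feat[d]
    return out


def render_sparse(codes, weights, dictionary):
    """Route B: blend sparse coefficients, then one dictionary product."""
    blended = [0.0] * len(dictionary)
    for code, w in zip(codes, weights):
        for idx, coeff in code:
            blended[idx] += w * coeff
    dim = len(dictionary[0])
    out = [0.0] * dim
    for idx, c in enumerate(blended):
        if c != 0.0:
            for d in range(dim):
                out[d] += c * dictionary[idx][d]
    return out


def make_scene(num_gaussians=8, dict_size=64, feat_dim=512, k=4, seed=0):
    """Random toy data for one ray (all sizes are illustrative assumptions)."""
    rng = random.Random(seed)
    dictionary = [[rng.gauss(0, 1) for _ in range(feat_dim)]
                  for _ in range(dict_size)]
    codes = []
    for _ in range(num_gaussians):
        idxs = rng.sample(range(dict_size), k)
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        codes.append([(i, r / total) for i, r in zip(idxs, raw)])
    alphas = [rng.uniform(0.1, 0.9) for _ in range(num_gaussians)]
    return dictionary, codes, alphas


dictionary, codes, alphas = make_scene()
weights = blend_weights(alphas)
dense = render_dense(codes, weights, dictionary)
sparse = render_sparse(codes, weights, dictionary)
max_diff = max(abs(a - b) for a, b in zip(dense, sparse))
print(f"max abs difference: {max_diff:.2e}")
```

In this toy setup the two routes agree up to floating-point rounding. Route B only accumulates a handful of scalars per Gaussian during blending, which is the intuition behind rendering high-dimensional features at the cost of splatting a low-dimensional one.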
