ObjectGS unifies 3D scene reconstruction with semantic understanding by modeling individual objects as anchors that generate neural Gaussians, achieving superior segmentation performance and integrating with applications such as mesh extraction and scene editing.
3D Gaussian Splatting is renowned for its high-fidelity reconstructions and
real-time novel view synthesis, yet its lack of semantic understanding limits
object-level perception. In this work, we propose ObjectGS, an object-aware
framework that unifies 3D scene reconstruction with semantic understanding.
Instead of treating the scene as a unified whole, ObjectGS models individual
objects as local anchors that generate neural Gaussians and share object IDs,
enabling precise object-level reconstruction. During training, we dynamically
grow or prune these anchors and optimize their features, while a one-hot ID
encoding with a classification loss enforces clear semantic constraints. We
show through extensive experiments that ObjectGS not only outperforms
state-of-the-art methods on open-vocabulary and panoptic segmentation tasks,
but also integrates seamlessly with applications like mesh extraction and scene
editing. Project page: https://ruijiezhu94.github.io/ObjectGS_page