We present the evaluation methodology, datasets and results of the BOP
Challenge 2024, the sixth in a series of public competitions organized to
capture the state of the art in 6D object pose estimation and related tasks. In
2024, our goal was to transition BOP from lab-like setups to real-world
scenarios. First, we introduced new model-free tasks, where no 3D object models
are available and methods need to onboard objects just from provided reference
videos. Second, we defined a new, more practical 6D object detection task where
identities of objects visible in a test image are not provided as input. Third,
we introduced new BOP-H3 datasets recorded with high-resolution sensors and
AR/VR headsets, closely resembling real-world scenarios. BOP-H3 include 3D
models and onboarding videos to support both model-based and model-free tasks.
Participants competed on seven challenge tracks, each defined by a task, object
onboarding setup, and dataset group. Notably, the best 2024 method for
model-based 6D localization of unseen objects (FreeZeV2.1) achieves 22% higher
accuracy on BOP-Classic-Core than the best 2023 method (GenFlow), and is only
4% behind the best 2023 method for seen objects (GPose2023) although being
significantly slower (24.9 vs 2.7s per image). A more practical 2024 method for
this task is Co-op which takes only 0.8s per image and is 25X faster and 13%
more accurate than GenFlow. Methods have a similar ranking on 6D detection as
on 6D localization but higher run time. On model-based 2D detection of unseen
objects, the best 2024 method (MUSE) achieves 21% relative improvement compared
to the best 2023 method (CNOS). However, the 2D detection accuracy for unseen
objects is still noticealy (-53%) behind the accuracy for seen objects
(GDet2023). The online evaluation system stays open and is available at
http://bop.felk.cvut.cz/