Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation

By Advanced AI Editor, July 1, 2025

[Submitted on 26 Jun 2025 (v1), last revised 28 Jun 2025 (this version, v2)]

Paper: HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation, by Xinzhuo Li and 5 other authors

Abstract: Recent progress in vision-language segmentation has significantly advanced grounded visual understanding. However, these models often exhibit hallucinations by producing segmentation masks for objects not grounded in the image content or by incorrectly labeling irrelevant regions. Existing evaluation protocols for segmentation hallucination primarily focus on label or textual hallucinations without manipulating the visual context, limiting their capacity to diagnose critical failures. In response, we introduce HalluSegBench, the first benchmark specifically designed to evaluate hallucinations in visual grounding through the lens of counterfactual visual reasoning. Our benchmark consists of a novel dataset of 1340 counterfactual instance pairs spanning 281 unique object classes, and a set of newly introduced metrics that quantify hallucination sensitivity under visually coherent scene edits. Experiments on HalluSegBench with state-of-the-art vision-language segmentation models reveal that vision-driven hallucinations are significantly more prevalent than label-driven ones, with models often persisting in false segmentation, highlighting the need for counterfactual reasoning to diagnose grounding fidelity.
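To make the idea of "persisting in false segmentation" concrete, here is a minimal Python sketch of how a single counterfactual pair might be scored. It is an illustration only and does not reproduce the metrics defined in the paper: the function names `iou` and `persistence`, the toy masks, and the scoring rule are all assumptions. The setup assumed here is that the queried object is present in the factual image and has been edited out of the counterfactual image, so a well-grounded model should return an empty mask for the same query on the edited image.

```python
def iou(pred, target):
    """Intersection-over-union of two binary masks given as nested 0/1 lists."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 0.0


def persistence(pred_counterfactual, original_region):
    """Hypothetical score: fraction of the original object's region that the
    model still segments after the object has been edited out of the image."""
    region_pixels = kept = 0
    for pred_row, region_row in zip(pred_counterfactual, original_region):
        for p, r in zip(pred_row, region_row):
            if r:
                region_pixels += 1
                kept += 1 if p else 0
    return kept / region_pixels if region_pixels else 0.0


# Toy 2x4 masks (assumed data, not taken from the benchmark).
gt_mask = [[1, 1, 0, 0],
           [1, 1, 0, 0]]              # where the object is in the factual image
pred_factual = [[1, 1, 0, 0],
                [1, 0, 0, 0]]         # model output on the factual image
pred_counterfactual = [[1, 1, 0, 0],
                       [0, 0, 0, 0]]  # model output after the object was replaced

factual_iou = iou(pred_factual, gt_mask)
hallucination = persistence(pred_counterfactual, gt_mask)
print(f"factual IoU = {factual_iou:.2f}, persistence = {hallucination:.2f}")
```

In this toy case the model segments the object reasonably well in the factual image but still marks half of the same region after the edit, which is the kind of vision-driven failure the abstract describes.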

Submission history

From: Adheesh Juvekar
[v1] Thu, 26 Jun 2025 17:59:12 UTC (20,978 KB)
[v2] Sat, 28 Jun 2025 15:32:51 UTC (20,978 KB)