EgoNight: Towards Egocentric Vision Understanding At Night With A Challenging Benchmark - Takara TLDR

Most existing benchmarks for egocentric vision understanding focus primarily
on daytime scenarios, overlooking the low-light conditions that are inevitable
in real-world applications. To investigate this gap, we present EgoNight, the
first comprehensive benchmark for nighttime egocentric vision, with visual
question answering (VQA) as the core task. A key feature of EgoNight is the
introduction of day-night aligned videos, which enhance night annotation
quality using the daytime data and reveal clear performance gaps between
lighting conditions. To achieve this, we collect both synthetic videos rendered
by Blender and real-world recordings, ensuring that scenes and actions are
visually and temporally aligned. Leveraging these paired videos, we construct
EgoNight-VQA, supported by a novel day-augmented night auto-labeling engine and
refinement through extensive human verification. Each QA pair is double-checked
by annotators for reliability. In total, EgoNight-VQA contains 3658 QA pairs
across 90 videos, spanning 12 diverse QA types, with more than 300 hours of
human work. Evaluations of state-of-the-art multimodal large language models
(MLLMs) reveal substantial performance drops when transferring from day to
night, underscoring the challenges of reasoning under low-light conditions.
Beyond VQA, EgoNight also introduces two auxiliary tasks, day-night
correspondence retrieval and egocentric depth estimation at night, that further
explore the boundaries of existing models. We believe EgoNight-VQA provides a
strong foundation for advancing application-driven egocentric vision research
and for developing models that generalize across illumination domains. All the
data and code will be made available upon acceptance.
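
As an illustration of how the day-to-night performance drop described above could be measured, the following is a minimal sketch, not the authors' evaluation code. It assumes each model answer has been scored as correct or incorrect and tagged with a lighting condition and a QA type. The record keys (`condition`, `qa_type`, `correct`), the function names, and the QA type labels `type_a` and `type_b` are all hypothetical placeholders; the toy records are made up for demonstration and are not EgoNight results.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Return accuracy for every (condition, qa_type) pair.

    `records` is an iterable of dicts with the hypothetical keys
    'condition' ('day' or 'night'), 'qa_type' (str) and 'correct' (bool).
    """
    # (condition, qa_type) -> [number correct, number seen]
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        key = (r["condition"], r["qa_type"])
        totals[key][0] += int(r["correct"])
        totals[key][1] += 1
    return {key: n_correct / n_total for key, (n_correct, n_total) in totals.items()}


def day_night_gap(records):
    """Return day accuracy minus night accuracy for each QA type.

    QA types that lack either a day or a night score are skipped.
    """
    acc = accuracy_by_condition(records)
    qa_types = {qa_type for (_, qa_type) in acc}
    gaps = {}
    for qa_type in sorted(qa_types):
        day_key, night_key = ("day", qa_type), ("night", qa_type)
        if day_key in acc and night_key in acc:
            gaps[qa_type] = acc[day_key] - acc[night_key]
    return gaps


# Toy, made-up records (placeholder QA type names, not real benchmark data).
records = [
    {"condition": "day", "qa_type": "type_a", "correct": True},
    {"condition": "day", "qa_type": "type_a", "correct": True},
    {"condition": "night", "qa_type": "type_a", "correct": True},
    {"condition": "night", "qa_type": "type_a", "correct": False},
    {"condition": "day", "qa_type": "type_b", "correct": True},
    {"condition": "day", "qa_type": "type_b", "correct": False},
    {"condition": "night", "qa_type": "type_b", "correct": False},
    {"condition": "night", "qa_type": "type_b", "correct": True},
]

gaps = day_night_gap(records)
for qa_type, gap in gaps.items():
    print(f"{qa_type}: day-night accuracy gap = {gap:.2f}")
```

A positive gap means the model answers that QA type more accurately on the daytime video than on its aligned nighttime counterpart. Reporting the gap per QA type rather than as a single average shows which kinds of questions are most sensitive to low light.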
