Optimizing What Matters: AUC-Driven Learning for Robust Neural Retrieval - Takara TLDR

Dual-encoder retrievers depend on the principle that relevant documents
should score higher than irrelevant ones for a given query. Yet the dominant
Noise Contrastive Estimation (NCE) objective, which underpins Contrastive Loss,
optimizes a softened ranking surrogate that we rigorously prove is
fundamentally oblivious to score separation quality and unrelated to AUC. This
mismatch leads to poor calibration and suboptimal performance in downstream
tasks like retrieval-augmented generation (RAG). To address this fundamental
limitation, we introduce the MW loss, a new training objective that maximizes
the Mann-Whitney U statistic, which is mathematically equivalent to the Area
under the ROC Curve (AUC). MW loss encourages each positive-negative pair to be
correctly ranked by minimizing binary cross entropy over score differences. We
provide theoretical guarantees that MW loss directly upper-bounds the AoC
(the area over the ROC curve, i.e., 1 - AUC), better aligning optimization
with retrieval goals. We further promote ROC curves and AUC as natural
threshold-free diagnostics for evaluating retriever
calibration and ranking quality. Empirically, retrievers trained with MW loss
consistently outperform contrastive counterparts in AUC and standard retrieval
metrics. Our experiments show that MW loss is an empirically superior
alternative to Contrastive Loss, yielding better-calibrated and more
discriminative retrievers for high-stakes applications like RAG.
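
The relationship between the pairwise objective and AUC can be illustrated with a minimal NumPy sketch. It assumes we already have raw similarity scores for one query's positive and negative documents. The function names `pairwise_auc` and `mw_loss` are illustrative and not taken from the paper, whose exact formulation (batching, temperature, weighting) may differ.

```python
import numpy as np

def pairwise_auc(pos_scores, neg_scores):
    """AUC as the normalized Mann-Whitney U statistic: the fraction of
    positive-negative pairs ranked correctly, with ties counted as half."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    # diff[i, j] = score of positive i minus score of negative j
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def mw_loss(pos_scores, neg_scores):
    """Binary cross entropy (target = 1) on sigmoid(s_pos - s_neg),
    averaged over all positive-negative pairs. Assumed form of the
    objective described in the abstract, not the authors' code."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    diff = pos[:, None] - neg[None, :]
    # -log(sigmoid(d)) = log(1 + exp(-d)), computed stably via logaddexp
    return float(np.mean(np.logaddexp(0.0, -diff)))

pos = [2.0, 0.5]   # hypothetical scores of relevant documents
neg = [1.0, -1.0]  # hypothetical scores of irrelevant documents
auc = pairwise_auc(pos, neg)
loss = mw_loss(pos, neg)
print(auc, loss)
```

In this toy example three of the four positive-negative pairs are ordered correctly, so `pairwise_auc` returns 0.75. Because log2(1 + exp(-d)) is at least 1 whenever d <= 0, dividing this sketch's loss by ln 2 always gives a value of at least 1 - AUC, which is one simple way to see how a pairwise cross-entropy loss can upper-bound the area over the curve.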
