Obtaining high-quality generations from modern LLMs has largely been framed as
a selection problem: identifying a single winning generation from a diverse
pool of N samples, known as Best-of-N (BoN). Yet this approach is inherently
zero-sum, discarding diverse and potentially useful information from the pool.
Instead, we explore a collaborative setup, where all candidates can potentially
contribute to the final winning generation. To this end, we propose Fusion-of-N
(FusioN): a method that uses a general LLM judge to synthesize the most
informative elements of each sample into a single final answer. We compare
FusioN to BoN in two settings: (i) test-time scaling, where we sample and
aggregate generations from a single model at test time; and (ii) synthetic data
generation, where we fuse samples from a pool of diverse teacher models to
improve a student model. We extensively benchmark both setups across 11
languages, 3 diverse tasks, and varying model scales. Across the board, FusioN
consistently outperforms BoN, showing versatility and robustness both in
test-time scaling
and in downstream gains from synthetic data generation. Further analysis shows
that FusioN has surprising strengths and remains robust under challenging
settings. These results show that we should shift how we evaluate and utilize
LLM generations: away from a monolithic measure of quality and toward
embracing their polylithic nature. This shift allows us
to integrate diverse strengths, unlock latent potential, and achieve
improvements that were previously inaccessible through selection alone.
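To make the contrast concrete, the sketch below illustrates Best-of-N selection versus Fusion-of-N synthesis over the same candidate pool. This is a minimal illustration, not the authors' implementation: the callables generate_fn and judge_fn, and the prompt wording, are assumptions standing in for whatever sampling model and general LLM judge are used.

```python
# Minimal sketch of BoN selection vs. FusioN synthesis over a pool of N samples.
# `generate_fn` and `judge_fn` are hypothetical stand-ins for LLM API calls.

from typing import Callable, List


def sample_pool(generate_fn: Callable[[str], str], prompt: str, n: int) -> List[str]:
    """Draw N candidate generations for the same prompt."""
    return [generate_fn(prompt) for _ in range(n)]


def best_of_n(judge_fn: Callable[[str], str], prompt: str, candidates: List[str]) -> str:
    """BoN: the judge picks a single winning candidate; the rest are discarded."""
    numbered = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
    choice = judge_fn(
        f"Prompt:\n{prompt}\n\nCandidate answers:\n{numbered}\n\n"
        f"Reply with the number of the single best answer."
    )
    # Parse the judge's reply defensively and clamp to a valid index.
    idx = int("".join(ch for ch in choice if ch.isdigit()) or "1") - 1
    return candidates[max(0, min(idx, len(candidates) - 1))]


def fusion_of_n(judge_fn: Callable[[str], str], prompt: str, candidates: List[str]) -> str:
    """FusioN: the judge synthesizes the most informative elements of every
    candidate into one final answer, so no candidate is wasted."""
    numbered = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
    return judge_fn(
        f"Prompt:\n{prompt}\n\nCandidate answers:\n{numbered}\n\n"
        f"Write a single final answer that combines the most informative and "
        f"accurate elements of the candidates."
    )
```

In the test-time scaling setting, the N candidates would come from a single model; in the synthetic data generation setting, they would come from a pool of diverse teacher models, with the fused output serving as training data for a student model.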