Paper page - Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding

Code & Resources: https://github.com/F2-Song/Weak-to-Strong-Decoding

Large Language Models (LLMs) require alignment with human preferences to avoid generating offensive, false, or meaningless content. Low-resource methods for LLM alignment have recently become popular, but they still struggle to produce content that is both high-quality and aligned. Motivated by the observation that the difficulty of generating aligned responses is concentrated at the beginning of decoding, we propose a novel framework, Weak-to-Strong Decoding (WSD), which enhances the alignment ability of base models through the guidance of a small aligned model. The small model first drafts a well-aligned beginning, and the large base model then continues the rest, with the handoff controlled by a well-designed auto-switch mechanism. We also collect a new dataset, GenerAlign, to fine-tune a small-sized Pilot-3B as the draft model, which effectively enhances different base models under the WSD framework to outperform all baseline methods while avoiding degradation on downstream tasks, known as the alignment tax. Extensive experiments further examine the impact of different settings and time efficiency, and in-depth analyses explore the intrinsic mechanisms of WSD.
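The draft-then-continue control flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the switch criterion, so the sketch assumes the handoff happens once the base model's average confidence in the drafted tokens, over a short window, reaches a threshold. The function names, the `threshold` and `window` parameters, and the stand-in "models" (plain Python functions) are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces: a "model" here is just a function over token lists.
NextToken = Callable[[List[str]], str]
Confidence = Callable[[List[str], str], float]


def weak_to_strong_decode(
    prompt: List[str],
    draft_next: NextToken,        # small aligned model: proposes the next token
    base_next: NextToken,         # large base model: proposes the next token
    base_confidence: Confidence,  # base model's probability for a drafted token
    threshold: float = 0.6,       # assumed switch threshold (not from the paper)
    window: int = 2,              # assumed averaging window (not from the paper)
    max_tokens: int = 20,
    eos: str = "<eos>",
) -> Tuple[List[str], Optional[int]]:
    """Return the generated tokens and the position where the switch happened."""
    tokens = list(prompt)
    recent: List[float] = []
    switched_at: Optional[int] = None
    while len(tokens) - len(prompt) < max_tokens:
        if switched_at is None:
            # Phase 1: the small model drafts; the base model only scores.
            tok = draft_next(tokens)
            recent = (recent + [base_confidence(tokens, tok)])[-window:]
            tokens.append(tok)
            if tok == eos:
                break
            # Assumed auto-switch rule: hand over once the base model is
            # confident enough in the drafted beginning.
            if len(recent) == window and sum(recent) / window >= threshold:
                switched_at = len(tokens) - len(prompt)
        else:
            # Phase 2: the base model continues the rest on its own.
            tok = base_next(tokens)
            tokens.append(tok)
            if tok == eos:
                break
    return tokens[len(prompt):], switched_at


def demo() -> Tuple[List[str], Optional[int]]:
    """Run the sketch with deterministic toy models."""
    prompt = ["Q"]
    script = ["I", "can", "help", "with", "that", "<eos>"]

    def draft_next(ctx: List[str]) -> str:
        return script[min(len(ctx) - len(prompt), len(script) - 1)]

    def base_next(ctx: List[str]) -> str:
        return "<eos>" if len(ctx) - len(prompt) >= 5 else "detail"

    def base_confidence(ctx: List[str], tok: str) -> float:
        # Toy rule: confidence grows as more aligned context accumulates.
        return min(1.0, 0.25 * (len(ctx) - len(prompt) + 1))

    return weak_to_strong_decode(prompt, draft_next, base_next, base_confidence)


if __name__ == "__main__":
    print(demo())
```

In this toy run the small model writes the first three tokens and the base model finishes the response. With real models, `base_confidence` would typically come from the base model's token probabilities for the drafted text, and the actual switching rule should be taken from the paper or the linked repository.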

