The collaborative paradigm of large and small language models (LMs)
effectively balances performance and cost, yet its pivotal challenge lies in
precisely identifying the moment to invoke the large LM when hallucinations
arise in the small LM. Previous optimization efforts have primarily focused on
post-processing techniques that operate separately from the reasoning process
of LMs, resulting in high computational costs and limited effectiveness. In
this paper, we propose a
practical invocation evaluation metric called AttenHScore, which quantifies the
accumulation and propagation of hallucinations during the generation process of
small LMs, continuously amplifying potential reasoning errors. By dynamically
adjusting the detection threshold, we achieve more accurate real-time
invocation of large LMs.
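A minimal sketch of this routing logic follows, assuming token-level entropy as
the uncertainty signal; the scoring rule and the names atten_h_score and
should_invoke_large_lm are hypothetical, not the paper's exact formulation:

```python
import torch

def atten_h_score(attn: torch.Tensor, token_entropy: torch.Tensor) -> float:
    """Toy attention-weighted hallucination score (hypothetical formula).

    attn: (T, T) self-attention averaged over heads/layers; attn[t, s] is
          how strongly generated token t attends to earlier token s.
    token_entropy: (T,) predictive entropy of each generated token.
    """
    score = token_entropy.clone()
    for t in range(1, score.shape[0]):
        # Propagate uncertainty forward: each token inherits the attention-
        # weighted uncertainty of its predecessors, so early errors are
        # accumulated and amplified at later positions.
        score[t] = score[t] + (attn[t, :t] * score[:t]).sum()
    return score.max().item()

def should_invoke_large_lm(score: float, threshold: float) -> bool:
    # Hand the query to the large LM only when the hallucination score
    # crosses the (dynamically adjusted) detection threshold.
    return score > threshold
```

In a deployment loop, the threshold could be adapted online, for example from a
running quantile of recent scores, rather than fixed in advance.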
Additionally, considering the limited reasoning capacity of small LMs, we
leverage uncertainty-aware knowledge reorganization to help them better capture
critical information from different text chunks.
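As one hedged sketch of such reorganization (the exact procedure, the function
name, and the signature below are assumptions), chunk-level uncertainty,
estimated for instance from token entropies as the small LM reads each chunk,
could drive a simple reordering that surfaces the most uncertain evidence
first:

```python
def reorganize_chunks(chunks: list[str], uncertainties: list[float]) -> list[str]:
    """Hypothetical uncertainty-aware reordering of retrieved text chunks.

    Chunks the small LM is most uncertain about are moved to the front of
    the prompt, where they are least likely to be overlooked, before the
    reorganized context is handed back to the model.
    """
    ranked = sorted(zip(uncertainties, chunks),
                    key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked]
```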
Extensive experiments reveal that our AttenHScore outperforms most baselines in
enhancing real-time hallucination detection across multiple QA datasets,
especially when addressing complex queries. Moreover, our strategies require no
additional model training and adapt flexibly to various transformer-based LMs.