Advanced AI News
VentureBeat AI

s3: The new RAG framework that trains search agents with minimal data

By Advanced AI Bot, May 28, 2025

Researchers at the University of Illinois Urbana-Champaign have introduced s3, an open-source framework designed to build retrieval-augmented generation (RAG) systems more efficiently than current methods.

s3 can benefit developers creating real-world large language model (LLM) applications, as it simplifies and reduces the cost of creating retriever models within RAG architectures.

RAG retrieval

The effectiveness of any RAG system hinges on the quality of its retrieval component. In their paper, the researchers categorize the evolution of RAG approaches into three distinct phases.

“Classic RAG” systems rely on static retrieval methods with fixed queries, where retrieval quality is disconnected from the ultimate generation performance. These architectures struggle with queries requiring contextual or multi-hop reasoning.
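As a rough illustration of the static pattern described above, the Python sketch below issues one fixed query and keeps the top-k documents. The word-overlap scorer, the function names and the `generate` callable are hypothetical stand-ins rather than anything taken from the paper; a real pipeline would typically use a BM25 or dense retriever instead.

```python
from collections import Counter
from typing import Callable


def overlap_score(query: str, doc: str) -> int:
    """Toy relevance score: number of word occurrences shared by query and doc."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())


def static_retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """One fixed query, one retrieval pass, top-k documents."""
    ranked = sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def classic_rag(query: str, corpus: list[str],
                generate: Callable[[str, list[str]], str], k: int = 2) -> str:
    """Retrieve once, then hand the documents to the generator.

    Nothing here feeds answer quality back into retrieval, which is the
    disconnect the article attributes to classic RAG.
    """
    docs = static_retrieve(query, corpus, k)
    return generate(query, docs)
```

Because the query never changes, a question that needs a second lookup based on the first result (a multi-hop question) gets no chance to issue one.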

A subsequent phase, dubbed “Pre-RL-Zero,” introduces more active LLM participation during inference. These techniques involve multi-turn interactions that interleave query generation, retrieval, and reasoning. However, they typically depend on zero-shot prompting and lack trainable components to optimize retrieval through direct outcome signals.

The most recent phase, “RL-Zero,” leverages reinforcement learning (RL) to train models to act as search agents, improving through outcome-based feedback like answer correctness. An example is Search-R1, which trains the model to interleave reasoning with search queries and retrieved context.

Despite their advancements, existing RL-Zero approaches often optimize retrieval using search-centric metrics that ignore downstream utility. Moreover, they require fine-tuning the LLM, which is costly and error-prone. By entangling retrieval with generation, they limit real search utility and compatibility with frozen or proprietary models. 

Different types of RAG (source: arXiv)

As the researchers put it, “This motivates a shift toward a modular framework where search and generation are cleanly separated, and optimization focuses purely on search quality with respect to downstream utility.”

s3

The s3 framework addresses this challenge with a model-agnostic approach. The main idea is to train a search agent with structured, multi-turn access to external knowledge. This search agent improves the quality of the retrieval stage without affecting the LLM that generates the final answer.

In s3, a dedicated searcher LLM iteratively interacts with a search engine. It generates queries based on the prompt, retrieves relevant documents, selects a useful subset of evidence, and decides whether to continue searching for more information. Once the search concludes, a separate, frozen generator LLM consumes this accumulated evidence to produce the final answer.
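The loop described above can be sketched in Python as follows. This is a minimal outline under stated assumptions, not the authors' implementation: the `Searcher` interface, its method names, the `max_turns` cap and the duplicate check are all hypothetical, and the search engine and generator are passed in as plain callables.

```python
from typing import Callable, Protocol


class Searcher(Protocol):
    """Hypothetical interface for the trainable searcher LLM."""

    def next_query(self, question: str, evidence: list[str]) -> str: ...
    def select(self, question: str, docs: list[str]) -> list[str]: ...
    def should_stop(self, question: str, evidence: list[str]) -> bool: ...


def s3_style_answer(
    question: str,
    searcher: Searcher,
    search_engine: Callable[[str], list[str]],
    generator: Callable[[str, list[str]], str],
    max_turns: int = 4,  # assumed safety cap on search rounds
) -> str:
    """Multi-turn search by the searcher, then a single call to a frozen generator."""
    evidence: list[str] = []
    for _ in range(max_turns):
        query = searcher.next_query(question, evidence)   # generate a query
        docs = search_engine(query)                       # retrieve documents
        for doc in searcher.select(question, docs):       # keep a useful subset
            if doc not in evidence:
                evidence.append(doc)
        if searcher.should_stop(question, evidence):      # decide whether to continue
            break
    # The generator only consumes the accumulated evidence; it is never updated.
    return generator(question, evidence)
```

Only the object behind `searcher` would be trained in this arrangement; `generator` can be any callable, including a wrapper around a closed API.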

s3 framework (source: arXiv)

A core innovation of s3 is its reward signal, Gain Beyond RAG (GBR). GBR quantifies the improvement in the generator’s accuracy when conditioned on documents retrieved by s3, compared to a baseline that retrieves the top documents matching the query. This reward incentivizes the searcher to find documents that truly enhance the generator’s output quality. 
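Read literally, the reward is a difference between two accuracy scores for the same frozen generator. The Python sketch below follows that description; the function and parameter names are hypothetical, and the `accuracy` callable is left as a placeholder because the article does not say which accuracy metric the paper uses.

```python
from typing import Callable


def gain_beyond_rag(
    question: str,
    gold_answer: str,
    s3_docs: list[str],        # documents gathered by the trained searcher
    baseline_docs: list[str],  # top documents from plain retrieval on the query
    generator: Callable[[str, list[str]], str],
    accuracy: Callable[[str, str], float],
) -> float:
    """Accuracy with searcher-selected documents minus accuracy with baseline documents."""
    acc_s3 = accuracy(generator(question, s3_docs), gold_answer)
    acc_rag = accuracy(generator(question, baseline_docs), gold_answer)
    return acc_s3 - acc_rag
```

A positive value means the searcher's documents helped the generator more than naive retrieval did; zero or a negative value means the extra search effort did not pay off for that question.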

“s3 decouples the retriever (searcher) from the generator. This lets companies plug in any off-the-shelf or proprietary LLM—whether GPT-4, Claude, or an internal model—without having to fine-tune it,” Patrick (Pengcheng) Jiang, lead author of the paper and doctoral student at UIUC, told VentureBeat. “For enterprises with regulatory or contractual constraints on model modification, or those that rely on closed-source LLM APIs, this modularity makes s3 highly practical. It allows them to enhance search quality without touching their generation infrastructure.”

s3 in action

The researchers tested s3 across six general-domain question-answering benchmarks, comparing it against three categories of RAG systems: end-to-end fine-tuning (e.g., Search-R1), static retrieval with frozen generators (such as classic RAG pipelines) and active retrieval with frozen generators (e.g., combining documents obtained by Search-R1 with a frozen LLM). In their experiments, they used Qwen2.5-7B-Instruct as the base model for the searcher, with Qwen2.5-14B-Instruct and Claude 3 Haiku as the frozen generator LLMs.

s3 surpassed static, zero-shot and end-to-end tuned baselines on most benchmarks and achieved an average score. Its data efficiency is particularly noteworthy: s3 achieved strong gains with only 2.4k training examples, significantly fewer than the 70k examples required by DeepRetrieval (a static retrieval framework) or the 170k needed by Search-R1, while outperforming both in context quality and final answer performance.
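To put those training-set sizes in proportion, the short Python calculation below divides each baseline's example count by s3's. It uses only the three figures quoted above; the variable and function names are illustrative.

```python
S3_EXAMPLES = 2_400  # "2.4k" as quoted in the article
BASELINE_EXAMPLES = {"DeepRetrieval": 70_000, "Search-R1": 170_000}


def data_ratio(baseline_examples: int, s3_examples: int = S3_EXAMPLES) -> float:
    """How many times more training examples a baseline uses than s3."""
    return baseline_examples / s3_examples


for name, count in BASELINE_EXAMPLES.items():
    print(f"{name}: about {data_ratio(count):.0f}x the training examples of s3")
# DeepRetrieval: about 29x the training examples of s3
# Search-R1: about 71x the training examples of s3
```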

s3 vs other RAG techniques (source: GitHub)

“Many enterprises lack large-scale annotated QA datasets or the GPU infrastructure to fine-tune end-to-end LLM systems. s3 lowers the barrier by enabling strong retrieval performance with minimal supervision and compute,” Jiang said. “This means faster prototyping, reduced costs and quicker time-to-deployment for AI-powered search applications.”

The findings suggest a fundamental shift in optimization strategy. As the researchers note in the paper, most of the performance gain in RAG stems from “improving the search capability instead of aligning generation outputs,” which implies that focusing RL on search strategy rather than combined generation alignment yields better results.

Another crucial finding for enterprise applications is s3’s ability to generalize to domains it has not been trained on. s3 showed zero-shot success on medical QA despite training only on general QA, suggesting that “reinforcement-learned search skills generalize more reliably than generation-tuned approaches,” according to the researchers. 

This cross-domain adaptability makes s3 well suited to specialized enterprise applications, which often deal with proprietary or bespoke datasets, because it does not require extensive domain-specific training data. A single trained searcher could therefore serve different departments (e.g., legal, HR, customer support) or adapt to evolving content such as new product documents.

“We see immediate potential in healthcare, enterprise knowledge management, and scientific research support, where high retrieval quality is critical and labeled data is often scarce,” Jiang said.
