Researchers at Alibaba Group have developed a novel approach that could dramatically reduce the cost and complexity of training AI systems to search for information, eliminating the need for expensive commercial search engine API calls during training.
The technique, called “ZeroSearch,” allows large language models (LLMs) to develop advanced search capabilities through a simulation approach rather than interacting with real search engines during the training process. This innovation could save companies significant API expenses while offering better control over how AI systems learn to retrieve information.
“Reinforcement learning [RL] training requires frequent rollouts, potentially involving hundreds of thousands of search requests, which incur substantial API expenses and severely constrain scalability,” write the researchers in their paper published on arXiv this week. “To address these challenges, we introduce ZeroSearch, a reinforcement learning framework that incentivizes the search capabilities of LLMs without interacting with real search engines.”
How ZeroSearch trains AI to search without search engines
The problem that ZeroSearch solves is significant. Companies developing AI assistants that can autonomously search for information face two major challenges: the unpredictable quality of documents returned by search engines during training, and the prohibitively high costs of making hundreds of thousands of API calls to commercial search engines like Google.
Alibaba’s approach begins with a lightweight supervised fine-tuning process to transform an LLM into a retrieval module capable of generating both relevant and irrelevant documents in response to a query. During reinforcement learning training, the system employs what the researchers call a “curriculum-based rollout strategy” that gradually degrades the quality of generated documents.
“Our key insight is that LLMs have acquired extensive world knowledge during large-scale pretraining and are capable of generating relevant documents given a search query,” the researchers explain. “The primary difference between a real search engine and a simulation LLM lies in the textual style of the returned content.”
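The curriculum-based rollout described above can be sketched in a few lines. This is an illustrative simplification, not the paper's implementation: `noise_probability` uses a linear ramp (the paper's exact schedule may differ), and `simulated_search` is a stand-in for the fine-tuned simulation LLM, which would actually generate document text.

```python
import random

def noise_probability(step, total_steps, p_start=0.0, p_end=0.5):
    """Fraction of simulated documents that should be irrelevant ('noisy')
    at a given training step. Linear ramp from p_start to p_end; the
    paper's actual curriculum schedule is an assumption here."""
    frac = step / max(total_steps, 1)
    return p_start + frac * (p_end - p_start)

def simulated_search(query, step, total_steps, k=5):
    """Stand-in for the simulation LLM: returns k documents for a query,
    each flagged relevant or noisy according to the curriculum, so the
    policy model faces gradually harder retrieval conditions."""
    p_noise = noise_probability(step, total_steps)
    docs = []
    for _ in range(k):
        if random.random() < p_noise:
            docs.append({"query": query, "relevant": False,
                         "text": f"(irrelevant document for '{query}')"})
        else:
            docs.append({"query": query, "relevant": True,
                         "text": f"(useful document for '{query}')"})
    return docs
```

Early in training, nearly every simulated document is relevant; by the end, roughly half are noise, forcing the policy model to learn to filter and re-query rather than trust every result.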
Outperforming Google at a fraction of the cost
In comprehensive experiments across seven question-answering datasets, ZeroSearch not only matched but often surpassed the performance of models trained with real search engines. Remarkably, a 7B-parameter retrieval module achieved performance comparable to Google Search, while a 14B-parameter module even outperformed it.
The cost savings are substantial. According to the researchers’ analysis, training with approximately 64,000 search queries using Google Search via SerpAPI would cost about $586.70, while using a 14B-parameter simulation LLM on four A100 GPUs costs only $70.80 — an 88% reduction.
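The reduction figure follows directly from the two costs reported in the paper:

```python
serpapi_cost = 586.70   # ~64,000 Google queries via SerpAPI (paper's estimate)
sim_llm_cost = 70.80    # 14B-parameter simulation LLM on four A100 GPUs
reduction = (serpapi_cost - sim_llm_cost) / serpapi_cost
print(f"{reduction:.0%}")  # -> 88%
```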
“This demonstrates the feasibility of using a well-trained LLM as a substitute for real search engines in reinforcement learning setups,” the paper notes.
What this means for the future of AI development
The work marks a notable shift in how AI systems can be trained: ZeroSearch shows that a model's search capabilities can be developed without depending on external tools such as commercial search engines.
The impact could be substantial for the AI industry. Until now, training advanced AI systems often required expensive API calls to services controlled by big tech companies. ZeroSearch changes this equation by allowing AI to simulate search instead of using actual search engines.
For smaller AI companies and startups with limited budgets, this approach could level the playing field. The high costs of API calls have been a major barrier to entry in developing sophisticated AI assistants. By cutting these costs by nearly 90%, ZeroSearch makes advanced AI training more accessible.
Beyond cost savings, this technique gives developers more control over the training process. When using real search engines, the quality of returned documents is unpredictable. With simulated search, developers can precisely control what information the AI sees during training.
The technique works across multiple model families, including Qwen-2.5 and LLaMA-3.2, and with both base and instruction-tuned variants. The researchers have made their code, datasets, and pre-trained models available on GitHub and Hugging Face, allowing other researchers and companies to implement the approach.
As large language models continue to evolve, techniques like ZeroSearch suggest a future where AI systems can develop increasingly sophisticated capabilities through self-simulation rather than relying on external services — potentially changing the economics of AI development and reducing dependencies on large technology platforms.
The irony is clear: in teaching AI to search without search engines, Alibaba may have created a technology that makes traditional search engines less necessary for AI development. As these systems become more self-sufficient, the technology landscape could look very different in just a few years.