Advanced AI News
VentureBeat AI

New AI training method creates powerful software agents with just 78 examples

By Advanced AI Editor | October 7, 2025 | 5 min read

A new study by Shanghai Jiao Tong University and SII Generative AI Research Lab (GAIR) shows that training large language models (LLMs) for complex, autonomous tasks does not require massive datasets.

Their framework, LIMI (Less Is More for Intelligent Agency), builds on similar work in other areas of LLM research and finds that “machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.” 

In other words, it's data quality, not quantity, that matters.

In experiments, the researchers found that with a small but carefully curated dataset of just 78 examples, they could train LLMs to outperform models trained on thousands of examples by a considerable margin on key industry benchmarks.

This discovery could have important implications for enterprise applications where data is scarce or expensive to collect.

The challenge of building agents that work

The researchers define agency as “the emergent capacity of AI systems to function as autonomous agents – actively discovering problems, formulating hypotheses, and executing solutions through self-directed engagement with environments and tools.” In other words, these are AI systems that “don’t just think, but work.”

The problem is that current training frameworks assume that higher agentic intelligence requires a lot of data, as has been shown in the classic scaling laws of language modeling. The researchers argue that this approach leads to increasingly complex training pipelines and substantial resource requirements. Moreover, in many domains, data is scarce, hard to obtain, and expensive to curate.

However, research in other areas of LLM training suggests that more data is not always necessary to achieve training objectives.

For example, LIMA, a 2023 paper, showed a model could be effectively aligned with just 1,000 curated examples. More recently, LIMO demonstrated that complex mathematical reasoning could emerge from only 817 training samples.

With LIMI, the researchers sought to apply the same “less is more” principle to the complex world of AI agents.

How LIMI works

The LIMI framework demonstrates that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Key to the framework is a pipeline for collecting high-quality demonstrations of agentic tasks. 

Each demonstration consists of two parts: a query and a trajectory. A query is a natural language request from a user, such as a software development requirement or a scientific research goal.

The trajectory is the series of steps the AI takes to address the query, including its internal reasoning, its calls to external tools like a code interpreter, and the observations it receives from the environment. For example, a query might be "build a simple chat application," and the trajectory would include the agent’s internal reasoning and action plan, the code it writes and executes, and the resulting output or errors.

The trajectory could include multiple iterations of planning, execution, and reflection until it achieves the desired objective.
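The query-trajectory structure described above can be sketched as a simple record type. This is an illustrative assumption, not the paper's actual schema; the field names (`reasoning`, `action`, `observation`) are made up for clarity.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    reasoning: str    # the agent's internal thinking for this step
    action: str       # e.g. a tool call or code it executes
    observation: str  # what the environment returned

@dataclass
class Demonstration:
    query: str                                        # natural-language task request
    trajectory: list[Step] = field(default_factory=list)  # steps taken to solve it

# A toy demonstration for the "build a simple chat application" example
demo = Demonstration(query="Build a simple chat application")
demo.trajectory.append(Step(
    reasoning="Start with a minimal server; verify it runs before adding features.",
    action="python server.py",
    observation="Serving on http://localhost:8000",
))
print(len(demo.trajectory))  # → 1
```

A real demonstration would contain many such steps, accumulating planning, execution, and reflection until the task is done.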

To build their dataset, the researchers started with 60 queries from real-world scenarios faced by professional developers and researchers. They then expanded this pool by using GPT-5 to synthesize additional queries from GitHub Pull Requests.

They then employed a team of four computer science PhD students to vet the quality of the queries, selecting 18 of the synthesized examples to round out a high-quality set of 78 queries focused on software development and research workflows.

To generate the trajectories, the same PhD students collaborated with a CLI coding agent powered by GPT-5 to complete the 78 tasks.

They followed an iterative process, collecting the entire interaction sequence until each task was successfully completed, capturing the full arc of realistic human-AI collaboration, including back-and-forth communication and iterative refinement. For the more complex queries, the collected trajectories could extend to more than 152,000 tokens.

“This approach guarantees that our models learn not only from successful outcomes but also from the complete problem-solving process, including how to adapt strategies and recover from failures during collaborative execution,” the researchers write.
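For supervised fine-tuning, each collected interaction ultimately has to be flattened into a single training sequence. A minimal sketch of that serialization step, assuming a made-up tag format (the actual LIMI pipeline uses its own agent/chat template):

```python
def serialize_demonstration(query: str, trajectory: list[dict]) -> str:
    """Flatten a query plus its multi-step trajectory into one training
    string. The <user>/<think>/<act>/<obs> tags here are illustrative
    assumptions, not LIMI's real template."""
    parts = [f"<user>{query}</user>"]
    for step in trajectory:
        parts.append(f"<think>{step['reasoning']}</think>")
        parts.append(f"<act>{step['action']}</act>")
        parts.append(f"<obs>{step['observation']}</obs>")
    return "\n".join(parts)

text = serialize_demonstration(
    "Build a simple chat application",
    [{"reasoning": "Write a minimal server first.",
      "action": "python server.py",
      "observation": "Serving on http://localhost:8000"}],
)
print(text.splitlines()[0])  # → <user>Build a simple chat application</user>
```

For the longest demonstrations in the dataset, the serialized sequence would run to six-figure token counts, consistent with the 152,000-token trajectories the paper reports.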

LIMI in action

To test their framework, the team evaluated models on AgencyBench, a benchmark designed for measuring agentic skills, as well as other established benchmarks for tool use and coding.

They fine-tuned GLM-4.5, a powerful open-source model, using their 78-sample dataset and compared its performance against several frontier models, including the base GLM-4.5, Kimi-K2-Instruct, and DeepSeek-V3.1. The LIMI-trained model achieved an average score of 73.5% on AgencyBench, significantly outperforming all baseline models, the best of which (GLM-4.5) scored 45.1%.

This superiority extended to other benchmarks covering tool use, coding, and scientific computing, where LIMI also outperformed all baselines.

More importantly, the study showed that the model trained on just 78 examples outperformed models trained with 10,000 samples from another dataset, delivering superior performance with 128 times less data. 
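The data-efficiency figure follows directly from the two dataset sizes:

```python
baseline_samples = 10_000  # samples in the comparison dataset
limi_samples = 78          # LIMI's curated dataset

# Ratio of dataset sizes, rounded to the nearest integer
print(round(baseline_samples / limi_samples))  # → 128
```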

“This discovery fundamentally reshapes how we develop autonomous AI systems, suggesting that mastering agency requires understanding its essence, not scaling training data,” the researchers write. “As industries transition from thinking AI to working AI, LIMI provides a paradigm for sustainable cultivation of truly agentic intelligence.”

The researchers have released the code for data synthesis and training, along with the model weights. For the enterprise, this approach offers a practical path toward developing highly specialized AI agents.

Instead of undertaking massive data collection projects, organizations can leverage their in-house talent and subject matter experts to create small, high-quality datasets for bespoke agentic tasks. This lowers the barrier to entry and enables businesses to build custom AI agents that can provide a competitive edge on the workflows that matter most to them.


