
The Allen Institute for AI (Ai2) has introduced Asta, a new AI platform that combines an agentic research assistant, a benchmark suite for evaluating scientific agents, and developer resources for building and testing tools. Ai2 positions Asta as a way to make agentic workflows more transparent and reproducible for research tasks.
At the center of this release is the Asta assistant, an AI agent built for common scientific workflows. The assistant is designed to help scientists find relevant papers, generate literature summaries with citations, and, in a limited beta, run basic data analyses. Ai2 says these functions are intended “to work the way scientists think,” while helping frame research questions, trace ideas to evidence, and clarify what’s established or still unresolved in a field.
AstaBench is the second part of the release. It is an evaluation framework and set of benchmarks that test agents across four categories: literature understanding, code and execution, data analysis, and end-to-end discovery. The initial suite includes more than 2,400 problems organized into 11 benchmarks. Ai2 says the suite will help scientists identify which agents best support their needs through task-relevant leaderboards, while giving AI developers a standard execution environment and standard tools to test the scientific reasoning capabilities of their agents. Agent results are compared against a large, integrated collection of well-known baselines from the literature, Ai2 says, including both open and closed LLM foundation models and agents.
As a third piece of this new ecosystem, Ai2 also released Asta Resources, a complete environment for AI developers to build, test, and refine trustworthy scientific AI agents. Asta Resources provides first-party Asta and baseline agents, methods for searching and navigating the scientific literature, and agent tools that are integrated with AstaBench. The set includes the Scientific Corpus Tool, a Model Context Protocol (MCP) extension of the Semantic Scholar API that gives agents free access to a normalized index of more than 200 million papers. Ai2 claims the index serves about 1.5 billion queries each year. Designed for seamless use with agents via MCP, the tool supports both sparse and dense full-text semantic search across open-access papers and adds functions for common graph-based discovery strategies, such as starting from one paper to find newer work that cites it or locating other papers by the same authors. The library also includes open-source agents and open language models that are post-trained for science.
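To make the two graph-based discovery strategies concrete, here is a minimal Python sketch that runs them over a tiny in-memory corpus. It does not call the Scientific Corpus Tool or the Semantic Scholar API. The `Paper` record, the `PAPERS` sample data, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical paper record; not the Semantic Scholar schema."""
    paper_id: str
    title: str
    year: int
    authors: tuple[str, ...]
    cites: tuple[str, ...] = ()  # IDs of papers this one cites


# Made-up sample corpus, used only to exercise the functions below.
PAPERS = [
    Paper("p1", "Foundations", 2018, ("Ada", "Ben")),
    Paper("p2", "Follow-up study", 2020, ("Cara",), cites=("p1",)),
    Paper("p3", "Survey", 2022, ("Ada",), cites=("p1", "p2")),
    Paper("p4", "Unrelated work", 2021, ("Dan",)),
]


def newer_citing_papers(corpus, paper_id):
    """Return papers that cite `paper_id` and are newer than it, newest first."""
    seed = {p.paper_id: p for p in corpus}[paper_id]
    hits = [p for p in corpus if paper_id in p.cites and p.year > seed.year]
    return sorted(hits, key=lambda p: p.year, reverse=True)


def other_papers_by_same_authors(corpus, paper_id):
    """Return other papers sharing at least one author with `paper_id`."""
    seed = {p.paper_id: p for p in corpus}[paper_id]
    seed_authors = set(seed.authors)
    return [
        p for p in corpus
        if p.paper_id != paper_id and seed_authors & set(p.authors)
    ]


if __name__ == "__main__":
    for p in newer_citing_papers(PAPERS, "p1"):
        print("cites p1:", p.paper_id, p.year)
    for p in other_papers_by_same_authors(PAPERS, "p1"):
        print("shares an author with p1:", p.paper_id)
```

A production tool would answer these queries from an indexed citation graph rather than by scanning a list, but the traversal logic is the same: follow citation edges forward in time, or pivot through shared authors.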

Ai2 plans to add more skills to Asta as each reaches a clear bar for accuracy and explanation. The goal is a single interface that brings advanced capability into one place while keeping researchers focused on science. Planned features include experiment replication, where the agent can duplicate computational studies described in papers by locating code repositories and data and loading the right packages. Another is hypothesis generation that proposes and refines testable questions grounded in prior evidence and flags promising leads. A third area is scientific programming, with code to clean data, run simple machine learning, and execute simulations.
The Allen Institute for AI is a Seattle nonprofit founded in 2014 by the late Microsoft co-founder Paul G. Allen. The organization develops open models (e.g., OLMo, Tulu), datasets, and tools, and operates the Semantic Scholar research platform used across the scientific community. The latest release, Asta, provides a practical scientific assistant and brings workflows, tests, and code into one open environment. If the new benchmark suite and resources hold up in outside labs, they could give research teams a shared way to judge agent performance, trace sources, and repeat results. For scientific AI, platforms like this could offer a path to using agents in fields where method and provenance matter most.