OpenAI Is Pushing For Industry-specific AI Benchmarks - Why That Matters

New AI models typically launch alongside benchmark results that showcase how well they perform on various tasks. However, these benchmarks cover general tasks, such as grade school mathematics (GSM8K) or graduate-level reasoning (GPQA), rather than tasks tailored to individual industries.

OpenAI Pioneers Program

To fill that gap, OpenAI launched the OpenAI Pioneers Program, intended to advance AI model development for specific industries and real-world use cases. The program is a two-pronged effort in which companies will collaborate with OpenAI researchers to develop more domain-specific evaluations and fine-tuned models.

we’re launching the openai pioneers program — a partnership between openai and companies building advanced ai products to (a) intensively fine-tune models that outperform at high value domain-specific tasks, and (b) build better real world evals that enable industries to better… https://t.co/cCvkGmYqJd

— Brad Lightcap (@bradlightcap) April 9, 2025

In its blog post about the program, OpenAI said that “industries like legal, finance, insurance, healthcare, accounting, and many others are missing a unified source of truth for model benchmarking.” As a result, OpenAI will work with multiple companies in each industry to develop those evaluations, which are meant not only to guide model development but also to build public trust in these systems.

Researchers have flagged this lack of benchmarks as a major gap in AI for enterprise use cases. For example, Silvio Savarese, head of Salesforce AI Research, released a blog post on Enterprise General Intelligence (EGI), a concept he is pioneering that refers to more advanced AI solutions tailored to businesses’ domain-specific needs. In a conversation with ZDNET, he said that one of the major steps toward EGI is developing benchmarks that evaluate domain-specific functions.

Refining existing models

Beyond evaluations, OpenAI will also work with each company’s team to refine existing models for three industry-specific use cases, using a technique known as reinforcement fine-tuning (RFT). OpenAI researchers will guide the companies on how to use RFT, and the companies can then decide how to deploy the resulting models, which should be ready for large-scale deployment, according to OpenAI.

The first cohort will consist of a handful of startups working on use cases that can “drive real-world impact.” If your company fits these criteria, you can apply through the form on the OpenAI Pioneers Program webpage, which asks for basic information about the company.
