Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

ByteDance Launches New Educational Product to Support Online Learning_the_model_Doubao

AI ALERT: Levi & Korsinsky Files Securities Fraud Class Action Against C3.ai, Inc. – October 21, 2025 Deadline

RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes – Takara TLDR

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Business AI
    • Advanced AI News Features
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
TechCrunch AI

OpenAI launches program to design new ‘domain-specific’ AI benchmarks

By Advanced AI EditorApril 9, 2025No Comments2 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


OpenAI, like many AI labs, thinks AI benchmarks are broken. It says it wants to fix them through a new program.

Called the OpenAI Pioneers Program, the program will focus on creating evaluations for AI models that “set the bar for what good looks like,” as OpenAI phrased it in a blog post.

“As the pace of AI adoption accelerates across industries, there is a need to understand and improve its impact in the world,” the company continued in its post. “Creating domain-specific evals are one way to better reflect real-world use cases, helping teams assess model performance in practical, high-stakes environments.”

As the recent controversy with the crowdsourced benchmark LM Arena and Meta’s Maverick model illustrate, it’s tough to know, these days, precisely what differentiates one model from another. Many widely-used AI benchmarks measure performance on esoteric tasks, like solving doctorate-level math problems. Others can be gamed, or don’t align well with most people’s preferences.

Through the Pioneers Program, OpenAI hopes to create benchmarks for specific domains like legal, finance, insurance, healthcare, and accounting. The lab says that, in the coming months, it’ll work with “multiple companies” to design tailored benchmarks and eventually share those benchmarks publicly, along with “industry-specific” evaluations.

“The first cohort will focus on startups who will help lay the foundations of the OpenAI Pioneers Program,” OpenAI wrote in the blog post. “We’re selecting a handful of startups for this initial cohort, each working on high-value, applied use cases where AI can drive real-world impact.”

Companies in the program will also have the opportunity to work with OpenAI’s team to create model improvements via reinforcement fine tuning, a technique that optimizes models for a narrow set of tasks, OpenAI says.

The big question is whether the AI community will embrace benchmarks whose creation was funded by OpenAI. OpenAI has supported benchmarking efforts financially before, and designed its own evaluations. But partnering with customers to release AI tests may be seen as an ethical bridge too far.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleAnother Hit Piece on Open-Source AI
Next Article Autonomous Driving Startup Nuro Raises $106M At $6B Valuation
Advanced AI Editor
  • Website

Related Posts

The billion-dollar infrastructure deals powering the AI boom

September 22, 2025

Nvidia plans to invest up to $100B in OpenAI

September 22, 2025

5 days left to save up to $668 on your TechCrunch Disrupt 2025 pass

September 22, 2025
Leave A Reply

Latest Posts

St. Patrick’s Cathedral Unveils Monumental Mural by Adam Cvijanovic

Three Loaned Banksy Works Incite Dispute Between England and Italy

Major Collection of Old Masters Paintings Could Be Fractionalized

New Collectors Drive Strong Sales at New York Fair

Latest Posts

ByteDance Launches New Educational Product to Support Online Learning_the_model_Doubao

September 22, 2025

AI ALERT: Levi & Korsinsky Files Securities Fraud Class Action Against C3.ai, Inc. – October 21, 2025 Deadline

September 22, 2025

RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes – Takara TLDR

September 22, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • ByteDance Launches New Educational Product to Support Online Learning_the_model_Doubao
  • AI ALERT: Levi & Korsinsky Files Securities Fraud Class Action Against C3.ai, Inc. – October 21, 2025 Deadline
  • RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes – Takara TLDR
  • Nvidia to invest $100 billion in OpenAI to help expand computing power
  • MIT Affiliates Secure AI Grants for Math Discovery

Recent Comments

  1. JeffreyDaf on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  2. Danielarive on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  3. Michaelvox on Curiosity, Grit Matter More Than Ph.D to Work at OpenAI: ChatGPT Boss
  4. vagina on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  5. OmeglePlayer on Veo 3 demo | Duck interrogation

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.