Center for AI Safety

Submit Your Toughest Questions for Humanity’s Last Exam

By Advanced AI Editor | April 1, 2025

CAIS and Scale AI are excited to announce the launch of Humanity's Last Exam, a project aimed at measuring how close we are to achieving expert-level AI systems. The exam aims to be the world's most difficult public AI benchmark, built by gathering experts across all fields. People who submit successful questions will be invited to be coauthors on the dataset's paper and will have a chance to win money from a $500,000 prize pool.

Why Participate?

AI is developing at a rapid pace. Just a few years ago, AI systems performed no better than random chance on MMLU, the AI community's most-downloaded benchmark (developed by CAIS). But just last week, OpenAI's newest model performed around the ceiling on all of the most popular benchmarks, including MMLU, and received top scores on a variety of highly competitive STEM olympiads. Humanity must maintain a clear understanding of the capabilities of AI systems. Existing tests have now become too easy, so we can no longer track AI progress well or gauge how far systems are from becoming expert-level.

Despite these advances, AI systems are still far from being able to answer difficult research questions and other intellectual challenges. To keep track of how far AI systems are from expert-level capabilities, we are developing Humanity's Last Exam, which aims to be the world's most difficult AI test.

Your Role

We’re assembling the largest, broadest coalition of experts in history to design questions that test how far AIs are from the human intelligence frontier. If there is a question that would genuinely impress you if an AI could solve it, we’d like to hear it from you!

If one or more of your questions is accepted, you will be offered optional co-authorship of the resulting paper. We have already received questions from researchers at MIT, UC Berkeley, Stanford, and more. The more of your questions that are accepted, the higher your name will appear in the author list. The top 50 questions will earn $5,000 each, and the next 500 questions will earn $500 each.
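
As a quick sanity check (assuming these two prize tiers account for the entire pool), the stated awards sum to the advertised total:

\[
50 \times \$5{,}000 \;+\; 500 \times \$500 \;=\; \$250{,}000 + \$250{,}000 \;=\; \$500{,}000
\]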

Prizes may be awarded based on question quality or novelty relative to other submissions. People who already submitted questions prior to this announcement are also eligible for these prizes. A small set of questions will be kept private to detect whether an AI is memorizing answers to public questions; prizes and co-authorship can still be awarded to people whose questions are kept in the private set.

Submission Guidelines

  • Challenge Level: Questions should be difficult for non-experts and not easily answerable via a quick online search. Avoid trick questions. Frontier AI systems are very good at answering even masters-level questions. It's strongly encouraged that question writers have 5+ years of experience in a technical industry job (e.g., SpaceX) or are a PhD student or above in academic training. In preparation for Humanity's Last Exam, we found that questions written by undergraduates tend to be too easy for the models. As a rule of thumb, if a randomly selected undergraduate can understand what is being asked, it is likely too easy for the frontier LLMs of today and tomorrow.
  • Objectivity: Answers should be accepted by other experts in the field and free from personal taste, ambiguity, or subjectivity. Provide all necessary context and definitions within the question. Use standard, unambiguous jargon and notation.
  • Originality: Questions must be your own work and not copied from others.
  • Confidentiality: Questions and answers should not be publicly available. You may use questions from past exams you've given if they're not accessible to the public.
  • Weaponization Restrictions: Do not submit questions related to chemical, biological, radiological, nuclear, or cyber weapons, or to virology.

Terms and conditions here.

Deadline: November 1, 2024

For a detailed list of instructions and example questions, please visit agi.safe.ai/submit.


