OpenAI has released two AI “reasoning” models that it says are its most capable yet as well as an open-source AI agent that helps computer programmers code, as the company seeks to gain a lead over its rivals.
The open-source coding agent, called Codex CLI, marks the first time since 2019 that OpenAI has introduced a significant open-source tool.
The new models are the full-scale version of its o3 model, which OpenAI says is its most advanced AI system, and a smaller but more efficient model called o4-mini.
“These are the first models where top scientists tell us they produce legitimately good and useful novel ideas,” OpenAI president Greg Brockman said in announcing the new products on Wednesday.
The models will be immediately available to users of its paid ChatGPT Plus and Pro services, as well as organizations that use its enterprise-focused Teams and API products.
The release of the new models comes at a time when OpenAI faces pressure to show it remains at the forefront of AI development. Earlier this year, China’s DeepSeek upended conventional wisdom about the technological edge U.S. AI labs such as OpenAI enjoyed for years. DeepSeek’s R1 mimicked the “chain of thought” reasoning that OpenAI’s o-series models offer. The fact that DeepSeek’s R1 was also an open model—meaning people could download it for free and customize it easily—has tilted many enterprises in favor of deploying such open-source models. Most of OpenAI’s models, in contrast, can only be accessed on a paid basis through a proprietary application programming interface (API).
At the same time, OpenAI has also faced increased competition from other proprietary model providers. In February, AI company Anthropic became the first to offer a model that combines quick, intuition-like answers with the ability to also perform “chain of thought” step-by-step reasoning if a prompt requires it. The ability to decide when reasoning is required and when a faster answer will do is a trick OpenAI has yet to match. Then, last month, Google unveiled its Gemini 2.5 Pro model, a reasoning model that beat OpenAI’s o3-mini model on numerous benchmarks.
On Wednesday, OpenAI moved to try to retake the lead in reasoning models. The company says its o3 and o4-mini models now top various benchmarks—although none of those results has yet been independently verified. It also says the models have the ability to autonomously use other software tools, such as web browsing and coding environments, without having to be specifically prompted to do so by a user.
In a demo of o3’s capabilities that OpenAI livestreamed Wednesday, AI researchers showed o3 analyzing a photo of a physics research poster from 2015 and then searching the web autonomously to find more recent relevant research and comparing the results. They also showed it autonomously deciding to run Python code to solve various math and coding challenges.
OpenAI said o3 and o4-mini have the ability to reason directly about visual information, such as sketches, diagrams, or photos—even ones that might be blurry or of poor quality. The company said the models also knew how to manipulate photos as part of their reasoning process.
Meanwhile, the new Codex CLI coding agent is designed to run on a user’s device. It taps a cloud-based connection to OpenAI’s o3 and o4-mini models to help it reason, while also being able to use other software tools deployed locally. Codex CLI doesn’t just suggest lines of code; it can autonomously decide to use a variety of tools to complete a task.
The company said Codex CLI would also soon be able to tap the capabilities of the GPT-4.1 model that it released earlier this week.
To encourage developers to experiment with Codex CLI, OpenAI said it had set up a $1 million fund that will disburse $25,000 grants in API credits to promising projects.
OpenAI said training o3 took about 10 times as much computing power as training o1, its previous best reasoning model.
This story was originally featured on Fortune.com