OpenAI has launched Codex, a cloud-based software engineering agent that writes code from natural-language prompts and can work on multiple tasks at the same time. The agent works on code provided through GitHub repositories and an environment the user sets up in advance; it runs without an internet connection, so access to external websites and APIs is denied.
Codex follows Codex CLI, which OpenAI launched in April 2025. Codex CLI is an open-source coding agent that runs in the user’s terminal, i.e., the command-line interface. Like Codex, it does not need to be connected to a browser or an Integrated Development Environment (IDE) such as Android Studio or Visual Studio Code (VS Code).
What can the Codex AI Agent do?
The AI agent released by OpenAI can perform tasks like fixing bugs, writing specific features, answering questions about a codebase, and proposing pull requests for review. It can read and edit files and run commands, including test harnesses, linters (tools that spot programming errors), and type checkers.
OpenAI trained the model using reinforcement learning on real-world coding tasks in a variety of environments, so that it generates code that closely mirrors human style and pull request (PR) preferences, adheres precisely to instructions, and iteratively runs tests until it receives a passing result.
Codex currently lacks features such as image inputs for frontend work, and users cannot course-correct the agent while it is working. Additionally, delegating work to a remote agent takes longer than interactive editing, which can take some getting used to. According to OpenAI, interacting with Codex agents will, over time, increasingly resemble asynchronous collaboration with colleagues.
How does it function?
Codex can be guided through AGENTS.md files placed in a repository. These markdown-based text files inform the agent how to run commands and navigate the codebase (the collection of human-written source code). Each agent runs in its own cloud container with no internet access: once setup is complete, connectivity is disabled and the agent begins working on the task. Users must therefore preload the code along with a development environment that the programmer defines.
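There is no single fixed schema for these files, but a minimal sketch of what an AGENTS.md might contain could look like the following (the section names, paths, and commands here are illustrative assumptions, not part of OpenAI’s specification):

```markdown
# AGENTS.md — illustrative example

## Project layout
- Application code lives in src/; tests live in tests/.

## Commands
- Install dependencies: pip install -r requirements.txt
- Run the test suite: pytest tests/
- Lint before proposing changes: ruff check src/

## Conventions
- Follow PEP 8 and keep functions short.
- Reference the related issue number in commit messages.
```

Because internet access is cut off after setup, any dependencies such a file asks the agent to install must already be available in the preloaded environment.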
Codex is also trained to provide verifiable evidence of its actions through terminal logs and test outputs. This lets users validate the model’s work and request further refinements where needed.
The Trend of Vibe Coding
Andrej Karpathy coined the term vibe coding, a method of getting LLMs and AI agents to write code from natural-language input. He describes it as a style of programming “where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.”
Cursor, Replit’s Agent, StackBlitz’s Bolt, and now OpenAI’s Codex are among the tools that offer vibe-coding features to users.
Writing code with the help of AI agents is being adopted worldwide. OpenAI says that Cisco, Temporal, Superhuman, and Kodiak are already using Codex for debugging, testing, automation, and similar work. However, software developers and coders were hit hardest in Microsoft’s recent layoffs: more than 40% of the 2,000 positions cut were related to coding, as per documents reviewed by Bloomberg.
Can Codex be abused through Vibe Coding?
Yes. However, OpenAI claims that Codex was trained to identify and refuse requests aimed at creating malicious software.
At the same time, OpenAI acknowledges that autonomous coding agents could be exploited to develop malware, including through low-level kernel engineering. The kernel is the brain of an operating system (OS), controlling its hardware and software.
Low-level kernel engineering refers to manipulating the OS at its deepest layer. For example, a user could create invisible malware that interacts with the OS to execute unauthorised actions. Notably, kernel-level operations are not visible to most user-level monitoring tools, such as antivirus alerts, system logs, or the task managers that display the list of running applications.
Why it Matters:
Vibe coding, or having AI agents write code, has significantly lowered barriers to entry. At the same time, it demands greater vigilance to maintain code quality.
“Vibe coding is all fun and games until you have to vibe debug,” said Ben South, who is building Variant. At times, AI also exhibits “AI paternalism”: deciding what is best for the user instead of directly answering the query.
Such moderation is especially important when self-harm or harm to others is involved. For example, if a user asks an AI model for ways to die by suicide, it can respond with mental health resources instead of providing harmful information. However, this approach can sometimes be excessive. In March, Cursor’s AI tool refused to write code for a user, advising him to “develop the logic” himself so that he could better understand system maintenance. The chatbot reasoned that “generating code for others can lead to dependency and reduced learning opportunities.”
This highlights the autonomous nature of AI agents and the growing reliance of users and companies on them in place of human oversight. The potential for mass layoffs calls for caution and raises the question of whether we can safely operate AI coding agents on autopilot.