
OpenAI Launches Codex, a Cloud-Based AI Coding Agent for Teams

[Image: A software developer at a MacBook with ChatGPT's Codex sidebar open, showing a coding task in progress with a prompt, generated Python code, and task details. Image Source: ChatGPT-4o]


OpenAI has launched a research preview of Codex, a cloud-based software engineering agent designed to take on real development tasks—including writing features, fixing bugs, answering questions about the code, running tests, and proposing pull requests. Starting today, Codex is available to ChatGPT Pro, Team, and Enterprise users, with support for Plus and Edu accounts coming soon.

Codex is powered by codex-1, a variant of OpenAI’s o3 model optimized for real-world coding. Codex-1 was trained using reinforcement learning on real-world software tasks across a range of environments, enabling it to generate code that mirrors human coding style, follows instructions precisely, and runs tests iteratively until they pass. Codex runs each task in a secure, isolated cloud environment preloaded with the user’s codebase, allowing developers to assign multiple tasks in parallel and monitor their progress in real time.

What Codex Can Do

Codex is designed to function like a junior developer working on background tasks. Users can ask Codex to:

  • Add new features

  • Answer questions about their codebase

  • Propose or revise pull requests

  • Fix bugs and run test suites

  • Refactor or rename components

  • Draft documentation or clean up code

Each task is launched from the Codex sidebar in ChatGPT. Developers submit prompts via “Code” or “Ask,” and Codex executes the task inside a sandboxed environment. Codex can read and write files, run tests, execute terminal commands, and provide verifiable logs for all actions.

Task times vary from 1 to 30 minutes depending on complexity, and users can monitor progress live. Once complete, Codex commits its changes in the environment, offering logs, test outputs, and citations to support its actions. Users can then review the results, request revisions, open a GitHub pull request, or integrate the changes directly into their local environment.

How It Works

Codex runs in isolated cloud containers with no internet access, meaning it only interacts with the developer’s codebase and preinstalled dependencies, and cannot access external websites, APIs, or other services. Developers can configure the Codex environment to closely match their real development setup, making its behavior more consistent with how their code runs in practice.

Agents can also be guided using AGENTS.md files—similar to README files—which instruct Codex on how to navigate the codebase, run tests, and follow your team's best practices. While Codex-1 performs well without custom setup, OpenAI notes that results improve when agents are given well-configured development environments, reliable testing setups, and clear documentation—conditions that help Codex navigate projects more effectively and produce higher-quality output.
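The announcement doesn't prescribe a format for these files, but as a hypothetical sketch, an AGENTS.md for a small Python project might look something like this (the sections and commands below are illustrative assumptions, not an official template):

```markdown
# AGENTS.md

## Project layout
- Application code lives in `src/`; tests live in `tests/`.

## Running tests
- Install dev dependencies with `pip install -e ".[dev]"`.
- Run the full suite with `pytest -q` before proposing changes.

## Conventions
- Format code with `black` and lint with `ruff` before committing.
- Keep changes small and scoped; one concern per pull request.
```

In practice, the same things that help a new human teammate (a clear layout, a one-command test suite, written conventions) are what this guidance suggests help the agent.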

On standardized coding evaluations and OpenAI’s internal software engineering benchmarks, Codex-1 shows strong performance—even without custom setup or AGENTS.md files. The charts below illustrate how it compares to other models across both external and internal tasks.

Trust, Safety, and Alignment

Codex is being released as a research preview, part of OpenAI’s strategy of iterative deployment. To support transparency and safety, each agent task includes verifiable outputs—such as terminal logs, test results, and citations—allowing users to trace what actions were taken and why. When tests fail or the agent is uncertain, Codex explicitly communicates the issue rather than proceeding silently. This design is meant to help users make informed decisions, but OpenAI emphasizes that all code generated by Codex should still be manually reviewed and validated before integration or execution.

OpenAI also trained Codex-1 to align closely with human development standards. Compared to its base model, o3, it consistently produces cleaner code patches that are easier to review and integrate into workflows.

On the security side, Codex was trained to refuse requests that appear intended for malware development or abusive use. At the same time, OpenAI emphasized that safeguards were designed to block malicious use without hindering legitimate applications—including complex tasks like low-level kernel engineering, which can resemble techniques used in malware development.

OpenAI has updated its policy frameworks and conducted additional safety evaluations to reinforce these safeguards. The results are detailed in a newly published addendum to the o3 System Card, which outlines how Codex handles sensitive requests and balances safety with legitimate developer use.

Early Testing and Adoption

OpenAI engineers have already begun using Codex internally for routine development tasks such as refactoring, renaming, writing tests, fixing small bugs, and responding to on-call issues. It’s particularly useful for reducing context-switching, surfacing forgotten to-dos, and keeping teams in flow by handling background work that might otherwise interrupt more complex coding efforts.

The company also worked with a small group of external testers. Examples include:

  • Cisco is exploring how Codex can support their engineering teams across a broad product portfolio. As an early design partner, Cisco is helping shape the tool’s future by testing real-world applications and providing direct feedback to OpenAI.

  • Temporal uses Codex to accelerate feature development, debug complex issues, write and run tests, and refactor large-scale codebases. By running tasks in the background, Codex helps engineers stay focused and maintain development velocity without constant interruptions.

  • Superhuman relies on Codex to streamline small but repetitive engineering tasks—like improving test coverage and resolving integration failures. It also enables product managers to make lightweight code changes independently, reducing handoff delays and keeping teams moving faster.

  • Kodiak Robotics uses Codex to write internal debugging tools, boost test coverage, and manage refactors—all of which support the ongoing development of the Kodiak Driver, their autonomous driving technology. Codex also acts as a reference assistant, helping engineers understand unfamiliar parts of the codebase by surfacing relevant history and context.

Based on early feedback, OpenAI recommends assigning well-scoped tasks to multiple agents in parallel and experimenting with different tasks and prompts to explore Codex’s strengths.

Codex CLI and Model Updates

In addition to the ChatGPT-based agent, OpenAI is updating Codex CLI, a lightweight coding agent that runs locally in a developer’s terminal. It brings models like o3 and o4-mini into your local development workflow, allowing you to collaborate with them directly in your terminal to complete tasks more efficiently.

Today’s update includes codex-mini-latest, a smaller version of codex-1 based on o4-mini and specifically designed for use in Codex CLI. This model is optimized for fast, responsive performance in local development workflows—supporting low-latency code editing, real-time Q&A, and streamlined instruction following. While lighter than codex-1, it retains the same strengths in stylistic consistency and task alignment. Codex-mini-latest is now the default model in Codex CLI and is also available via the API, with regular snapshot updates planned as the model continues to improve.

Codex CLI now supports sign-in using your existing ChatGPT account, eliminating the need to manually generate and configure API tokens. After signing in, you can select the API organization you want to use, and Codex CLI will automatically generate and apply the correct API key—streamlining setup for developers.

Pricing and Availability

  • Codex in ChatGPT is now available to Pro, Team, and Enterprise users. Support for Plus and Edu accounts is coming soon.

  • Eligible users get generous free access during the research preview period. Longer term, OpenAI will introduce rate-limited free access, with flexible pricing for additional usage.

  • For developers using codex-mini-latest via API, pricing is $1.50 per million input tokens and $6 per million output tokens, with a 75% prompt caching discount.

  • Plus and Pro users who sign in to Codex CLI with their ChatGPT credentials can redeem free API credits for the next 30 days: $5 for Plus users, and $50 for Pro users.
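To make the API pricing concrete, here is a minimal sketch of a per-request cost estimator using the rates above. One assumption is ours, not from the announcement: that the 75% prompt-caching discount applies to the cached portion of input tokens.

```python
def codex_mini_cost(input_tokens: int, output_tokens: int,
                    cached_input_tokens: int = 0) -> float:
    """Estimate the dollar cost of one codex-mini-latest API request.

    Rates from the announcement: $1.50 per 1M input tokens and
    $6.00 per 1M output tokens. We assume the 75% prompt-caching
    discount applies to cached input tokens (hypothetical detail).
    """
    INPUT_RATE = 1.50 / 1_000_000   # dollars per input token
    OUTPUT_RATE = 6.00 / 1_000_000  # dollars per output token
    CACHE_DISCOUNT = 0.75           # discount on cached input tokens

    uncached = input_tokens - cached_input_tokens
    cost = (
        uncached * INPUT_RATE
        + cached_input_tokens * INPUT_RATE * (1 - CACHE_DISCOUNT)
        + output_tokens * OUTPUT_RATE
    )
    return round(cost, 6)


# Example: 1M input + 1M output tokens, nothing cached -> $7.50
print(codex_mini_cost(1_000_000, 1_000_000))
```

Output tokens dominate the bill at these rates, so under this sketch, long generated patches cost roughly four times as much per token as the prompt context that produced them.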

Limitations and Future Direction

As a research preview, Codex currently lacks support for some features—such as image inputs for frontend development and the ability to intervene mid-task to course-correct the agent. Delegating work to a remote agent also takes longer than traditional, interactive editing, which may require some adjustment for developers used to immediate feedback.

Looking ahead, OpenAI expects Codex to evolve into a more flexible, collaborative agent—capable of handling more complex tasks over longer durations, and interacting more like a teammate in asynchronous workflows.

Looking Ahead

OpenAI plans to expand Codex from a single-agent tool into a broader suite of AI workflows that support both real-time collaboration and longer-term delegation. The company sees this as part of a deeper shift in how software is developed: moving from moment-to-moment code generation to asynchronous, multi-agent teamwork.

Codex is already enabling this new mode inside ChatGPT. OpenAI believes this asynchronous, multi-agent workflow will become the default way engineers produce high-quality code—freeing them to focus on strategic or creative tasks while background agents handle well-scoped implementation work.

To support that future, OpenAI plans to introduce:

  • Mid-task guidance, allowing developers to steer agents while they're working

  • Proactive progress updates, so users can stay informed without checking in manually

  • Deeper tool integrations, including task assignment directly from ChatGPT Desktop, Codex CLI, GitHub, issue trackers, and CI systems

Ultimately, OpenAI sees these two modes of interaction—real-time pairing and asynchronous task delegation—converging. The company envisions a future where developers collaborate with AI agents across their IDEs and everyday tools to ask questions, get suggestions, and offload longer tasks, all within a unified workflow.

“Software engineering is one of the first industries to experience significant AI-driven productivity gains,” the announcement notes. “This is just the beginning—and we’re excited to see what you build with Codex.”

What This Means

Codex is a step toward a new model of software development—one where AI agents act as collaborative workers, not just autocomplete tools. By turning natural language prompts into fully executed coding tasks, Codex moves beyond code generation and into software execution.

This shift could reshape how engineering teams work, especially as agents become more capable, auditable, and integrated with real-world codebases. It opens the door for smaller teams to build more, for non-developers to contribute lightweight changes, and for companies to delegate background engineering work to AI.

At the same time, it raises important questions about developer oversight, skill development, and the long-term impact of AI delegation. Codex is still early—but it offers a clear look at where AI-assisted programming is headed: more asynchronous, more scalable, and more deeply embedded in day-to-day workflows.

With Codex, AI is no longer just a coding assistant—it’s becoming part of the team.

Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.