OpenAI Expands Responses API with Tools for Smarter, Faster Agents

Image Source: ChatGPT-4o
OpenAI has expanded its Responses API—used by developers to build applications powered by large language models (LLMs)—with a slate of new tools and improvements aimed at boosting capability, performance, and enterprise reliability.
The updates include support for remote MCP (Model Context Protocol) servers, the integration of tools like image generation and Code Interpreter, and enhancements to file search. These additions are available across OpenAI’s latest model families, including GPT‑4o, GPT‑4.1, and the o-series reasoning models like o3 and o4-mini.
More Capable Agents, Built-in
The Responses API now allows models to call tools directly as part of their reasoning process. This gives developers a way to build smarter, more context-aware agents that maintain continuity across requests and tool calls. According to OpenAI, this setup not only improves performance and accuracy but also reduces latency and cost for developers using models like o3 and o4-mini.
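As a minimal sketch of what this looks like in practice, the request body below exposes a built-in tool to the model so it can invoke the tool mid-reasoning. The field names follow the published Responses API shape, but the model name, prompt, and tool configuration here are illustrative assumptions; check OpenAI's API reference for the current schema.

```python
# Sketch of a Responses API request body that makes a built-in tool
# available to the model during its reasoning process.
def build_agent_request(model: str, prompt: str, tools: list) -> dict:
    """Assemble the JSON body for a POST to the Responses API."""
    return {
        "model": model,
        "input": prompt,
        "tools": tools,  # the model may call any of these while reasoning
    }

request = build_agent_request(
    model="o4-mini",
    prompt="Plot the distribution of values in data.csv.",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)
```

Because the tool call happens inside the model's reasoning loop, the developer sends one request and receives a final answer, rather than orchestrating tool calls round-trip by round-trip.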
New Tools Now Available:
Image generation: Developers can access OpenAI’s latest image model, gpt-image-1, directly within the Responses API. This tool supports real-time streaming previews and step-by-step refinement through multi-turn edits. Learn more in OpenAI's Docs.
Code Interpreter: Integrated for applications involving data analysis, complex math and coding problems, and even visual reasoning ("thinking with images"), this tool boosts performance when used with OpenAI’s o3 and o4-mini reasoning models—showing gains across several benchmarks, including Humanity’s Last Exam. Learn more in OpenAI's Docs.
File search: An upgraded version of file search allows users to pull relevant document snippets into a model’s context. Updates now allow file search to query across multiple vector stores, making it easier to manage distributed document sources. It also supports more advanced filtering using arrays of attributes, enabling more precise control over what content is retrieved. Learn more in OpenAI's Docs.
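The three tools above can be declared together in a single `tools` array. The sketch below assumes the documented tool type names; the vector store IDs and filter values are placeholders, and the exact filter schema should be verified against OpenAI's current API reference.

```python
# Illustrative tools array combining the three built-in tools described
# above. Parameter names mirror OpenAI's documented shapes; the specific
# IDs and filter values are hypothetical.
tools = [
    {"type": "image_generation"},  # backed by gpt-image-1
    {"type": "code_interpreter", "container": {"type": "auto"}},
    {
        "type": "file_search",
        # Query across multiple vector stores in one call:
        "vector_store_ids": ["vs_products", "vs_support"],
        # Attribute filter restricting which documents are retrieved:
        "filters": {"type": "eq", "key": "doc_type", "value": "faq"},
    },
]
```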
Support for Remote MCP Servers
OpenAI is also enabling support for remote MCP servers, building on the protocol’s earlier introduction in its Agents SDK. MCP is an open standard that helps applications supply context to LLMs in a structured way.
With this new integration, developers can link OpenAI models to external tools and data sources hosted on any MCP-compliant server using minimal code. For example:
A Shopify integration can automatically add products to a user’s cart.
A Stripe setup can generate custom payment links based on real-time usage data.
A Twilio integration can retrieve information, summarize it, and send the result via SMS to a user’s phone.
Popular services already supporting remote MCP connections include Cloudflare, HubSpot, Intercom, PayPal, Plaid, Shopify, Square, Stripe, Twilio, and Zapier. OpenAI has also joined the MCP steering committee to help shape the protocol’s future.
To get started with your own remote MCP server, see Cloudflare’s setup guide. For instructions on how to use the MCP tool with the Responses API, visit OpenAI’s API Cookbook.
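Wiring a model to a remote MCP server is a matter of adding one entry to the `tools` array. In the hedged sketch below, the field names mirror OpenAI's documented MCP tool shape, while the server label and URL are placeholders standing in for a real MCP-compliant endpoint such as the Shopify example above.

```python
# Hypothetical remote MCP tool entry for a Responses API request.
# The server_label and server_url are placeholders, not real endpoints.
mcp_tool = {
    "type": "mcp",
    "server_label": "shopify",
    "server_url": "https://example-store.example.com/api/mcp",  # placeholder
    # Approval policy: "never" skips per-call confirmation; stricter
    # settings can require approval before each tool invocation.
    "require_approval": "never",
}

request_body = {
    "model": "gpt-4.1",
    "input": "Add the blue hoodie in size M to my cart.",
    "tools": [mcp_tool],
}
```

From the model's perspective, the MCP server's tools behave like built-in tools: the model discovers them, decides when to call them, and folds the results back into its response.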
Enterprise-Ready Features
Several new features have been added to support the needs of enterprise customers and developers managing complex or sensitive workloads:
Background mode: Enables asynchronous processing for long-running tasks, reducing the risk of timeouts or disconnects. Developers can either poll background tasks to check their status or stream events as needed to keep their applications up to date with the latest progress. Learn more in OpenAI's Docs.
Reasoning summaries: Offers concise natural-language summaries of the model’s internal reasoning path, improving transparency and debuggability. Reasoning summaries are included at no extra charge. Learn more in OpenAI's Docs.
Encrypted reasoning items: Supports reuse of reasoning steps across requests without storing data on OpenAI’s servers, ideal for Zero Data Retention (ZDR) customers. For models such as o3 and o4-mini, reusing reasoning items across function calls enhances intelligence, cuts down token usage, and improves cache efficiency—leading to lower costs and reduced latency. Learn more in OpenAI's Docs.
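The enterprise options above combine naturally in a single request. The sketch below assumes the documented parameter names (`background`, `store`, `include`); treat it as an illustrative shape for a ZDR-style deployment rather than a verified call.

```python
# Sketch of a Responses API request using the enterprise features
# described above: asynchronous background execution plus encrypted
# reasoning items for a Zero Data Retention setup.
zdr_request = {
    "model": "o3",
    "input": "Summarize the attached audit logs.",
    "background": True,   # run asynchronously; poll or stream for progress
    "store": False,       # ZDR: nothing is persisted on OpenAI's servers
    # Return encrypted reasoning so it can be reused across requests:
    "include": ["reasoning.encrypted_content"],
}
```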
Pricing and Availability
All new tools and features are now live in the Responses API, compatible with GPT‑4o, GPT‑4.1, and OpenAI’s o-series models. Among the reasoning models, image generation is supported only on o3. Pricing details include:
Image generation:
$5.00 per million text input tokens
$10.00 per million image input tokens
$40.00 per million image output tokens
75% discount for cached input tokens
Code Interpreter: $0.03 per container
File search: $0.10 per GB per day for vector storage and $2.50 per 1,000 tool calls
Remote MCP server: No added cost; output tokens are billed as usual
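With the per-token rates above, estimating the cost of an image-generation call is simple arithmetic. The function below hard-codes the listed rates and applies the 75% cached-input discount to text input tokens; the example token counts are illustrative, and actual bills depend on current pricing.

```python
# Estimate image-generation cost from the per-million-token rates above.
TEXT_IN = 5.00 / 1_000_000    # $ per text input token
IMAGE_IN = 10.00 / 1_000_000  # $ per image input token
IMAGE_OUT = 40.00 / 1_000_000 # $ per image output token

def image_gen_cost(text_in: int, image_in: int, image_out: int,
                   cached_in: int = 0) -> float:
    """Rough cost estimate; cached input tokens get the 75% discount."""
    full = (text_in - cached_in) * TEXT_IN
    cached = cached_in * TEXT_IN * 0.25  # pay 25% of the normal rate
    return full + cached + image_in * IMAGE_IN + image_out * IMAGE_OUT

# Example: 1,000 text input tokens, no image input, 4,160 image output tokens
cost = image_gen_cost(text_in=1_000, image_in=0, image_out=4_160)
# cost ≈ $0.17 for this hypothetical call
```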
Learn more about the pricing models in OpenAI's Docs.
What This Means
These updates mark a turning point for developers building AI-driven applications. The Responses API is no longer just a way to get text output—it’s becoming a comprehensive platform for building intelligent, multi-modal agents that can reason, act, and adapt in real time.
With built-in access to tools like image generation, code execution, and document search—plus support for external MCP servers—developers can now create agents that seamlessly pull in context, analyze data, and even trigger actions across third-party services like Stripe or Twilio. This drastically reduces the complexity of integrating LLMs into real-world workflows.
For enterprises, the addition of features like encrypted reasoning items and background processing offers the control, security, and scalability needed to move from experimentation to production. These capabilities make it easier to maintain privacy, handle long tasks reliably, and debug agent behavior with greater transparency.
Ultimately, OpenAI’s expanded Responses API is shaping a new class of applications—ones that go beyond chat and become intelligent co-pilots across business, creative, and operational domains.
Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.