New Capabilities for the Responses API
OpenAI is shifting the paradigm from isolated language models to agents capable of executing complex workflows. The Responses API now includes a shell tool integrated with a hosted container workspace, allowing models to move beyond text generation and actually perform actions: running services, fetching data from APIs, generating spreadsheets, and managing files.
How the Shell Tool Works
The shell tool functions as a native integration in the Responses API, enabling models to propose Unix shell commands that execute in an isolated container environment. Built on familiar command-line utilities like grep, curl, and awk, the shell tool supports any executable available in a standard Linux environment—including Go, Java, Node.js, and Python applications. Unlike the existing code interpreter, which is limited to Python, this broader approach unlocks use cases like running compiled programs or starting network services.
The agent loop operates iteratively:
- Model receives prompt and tool instructions
- Model proposes one or more shell commands to run (requires GPT-5.2 or later)
- Responses API forwards commands to the container runtime
- Output streams back to the API in near real-time
- Model inspects results and determines next action or final response
- Loop continues until task completion
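The loop above can be sketched in a few lines of Python. This is a local stand-in, not the hosted implementation: the model is replaced by a stub (`fake_model`), commands run via `subprocess` instead of an isolated container, and the `shell_call`/`final` action shapes are illustrative rather than the API's wire format.

```python
import subprocess

def fake_model(transcript):
    """Stand-in for the model: proposes a command on the first turn,
    then finishes once it has seen tool output. Purely illustrative."""
    if not any(turn["role"] == "tool" for turn in transcript):
        return {"type": "shell_call", "command": "echo hello from the workspace"}
    return {"type": "final", "text": "Done: the command printed a greeting."}

def run_agent_loop(prompt, model=fake_model, max_turns=5):
    transcript = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        action = model(transcript)
        if action["type"] == "final":
            return action["text"]
        # Execute the proposed command and stream its output back
        # into the transcript for the model to inspect.
        result = subprocess.run(
            action["command"], shell=True, capture_output=True, text=True, timeout=30
        )
        transcript.append({"role": "tool", "content": result.stdout + result.stderr})
    raise RuntimeError("agent loop did not converge within max_turns")
```

The key structural point is that the model never executes anything itself; it only proposes commands, and the surrounding loop (here local, in production the Responses API) performs execution and feeds results back.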
Production-Grade Features
Concurrent Execution: The model can propose multiple shell commands in a single step, and the Responses API executes them concurrently across separate container sessions, then multiplexes results back as structured context.
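A minimal local analogue of this fan-out pattern, using a thread pool and `subprocess` in place of separate container sessions (the dict shape of each result is an assumption for illustration):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_command(command, timeout=30):
    """Run one shell command in its own subprocess."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return {
        "command": command,
        "exit_code": result.returncode,
        "output": result.stdout + result.stderr,
    }

def run_concurrently(commands):
    """Execute proposed commands in parallel, then return results in the
    original order so they can be handed back as structured context."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(run_command, commands))
```

Returning results in proposal order, rather than completion order, keeps the mapping between each command and its output unambiguous when the batch is folded back into the model's context.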
Bounded Output: To prevent terminal logs from consuming context budgets, developers can specify output limits per command. The API preserves the beginning and end of output while truncating the middle, enabling fast, context-efficient reasoning.
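The head-and-tail truncation strategy can be sketched as follows; the character-based limit and the marker text are assumptions, not the API's actual parameters:

```python
def bound_output(text, limit=2000, marker="\n[... output truncated ...]\n"):
    """Keep the beginning and end of a long log, dropping the middle,
    so tool output stays within a fixed context budget."""
    if len(text) <= limit:
        return text
    keep = (limit - len(marker)) // 2
    return text[:keep] + marker + text[-keep:]
```

Preserving both ends matters in practice: the head usually carries the command banner and early errors, while the tail carries the final status or stack trace, and either alone is often insufficient for the model's next decision.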
Context Compaction: Long-running tasks fill the context window across multiple turns. The Responses API includes native compaction designed to preserve key details while removing extraneous information, eliminating the need for custom summarization systems.
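As a rough illustration of what compaction does (the native implementation is not described in detail here), a naive version might keep the opening turn and the most recent turns verbatim while collapsing everything in between:

```python
def compact_context(turns, keep_recent=4):
    """Naive compaction sketch: keep the first turn and the most recent
    turns verbatim; collapse the middle into a single stub. A real system
    would summarize the dropped turns rather than merely count them."""
    if len(turns) <= keep_recent + 1:
        return list(turns)
    dropped = len(turns) - keep_recent - 1
    stub = {"role": "system", "content": f"[{dropped} earlier turns compacted]"}
    return [turns[0], stub] + turns[-keep_recent:]
```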
Isolated Container Workspace: Commands execute in an isolated environment with a filesystem, optional structured storage (SQLite), and restricted network access—addressing security concerns while enabling practical workflows.
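The optional SQLite storage lends itself to workflows where an agent persists structured results between commands instead of re-parsing raw logs. A small sketch (the in-memory database and table schema are illustrative; a real workspace would use a file path):

```python
import sqlite3

# Persist fetched records so later commands can query them directly.
conn = sqlite3.connect(":memory:")  # illustrative; a workspace would use a file
conn.execute("CREATE TABLE results (url TEXT, status INTEGER)")
conn.execute("INSERT INTO results VALUES (?, ?)", ("https://example.com", 200))
conn.commit()

rows = conn.execute(
    "SELECT status FROM results WHERE url = ?", ("https://example.com",)
).fetchall()
```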
Getting Started
GPT-5.2 and later models are trained to propose shell commands when the shell tool is available. Developers can access this through the Responses API by including the shell tool definition in their requests. The hosted container environment handles infrastructure concerns, allowing teams to focus on building reliable, repeatable production workflows.
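A minimal request body might look like the following. The `{"type": "shell"}` tool definition and the surrounding field names are assumptions for illustration; consult the Responses API reference for the exact schema.

```python
# Hedged sketch of a Responses API request body enabling the shell tool.
# Field names below ("tools", "input", the "shell" tool type) are assumed
# for illustration and may differ from the actual API schema.
request_body = {
    "model": "gpt-5.2",
    "tools": [{"type": "shell"}],
    "input": "Clone the repo, run the test suite, and summarize any failures.",
}
```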