New Agent Execution Environment
OpenAI has expanded the Responses API to include a shell tool and a hosted container workspace, enabling developers to build agents that execute complex workflows. Unlike previous approaches that limited models to pure inference, agents can now propose and execute shell commands in an isolated compute environment, handling tasks such as running services, querying APIs, and generating artifacts like spreadsheets or reports.
How It Works
The shell tool allows models to interact with a computer through familiar Unix utilities (grep, curl, awk, etc.). The Responses API orchestrates a tight execution loop: the model proposes actions (like reading files or fetching data), the platform executes them in the container, and results feed back to the model for the next step. GPT-5.2 and later models are trained to propose shell commands natively.
Key features include:
- Concurrent execution: Models can propose multiple shell commands that execute in parallel, with the Responses API multiplexing streams back into structured context
- Bounded output: Large command outputs are capped with preserved beginning and end text to manage context window usage
- Isolated environment: Commands run in a restricted container with a filesystem, optional structured storage (SQLite), and controlled network access
- Streaming results: Shell output flows back to the model in near real-time for dynamic decision-making
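The bounded-output behavior (keep the beginning and end of large output, drop the middle) can be approximated as follows; the cap size and marker text are assumptions for illustration, not the platform's actual values.

```python
TRUNCATION_MARKER = "\n...[output truncated]...\n"

def cap_output(text: str, max_chars: int = 4096) -> str:
    # Return the output unchanged if it fits within the cap.
    if len(text) <= max_chars:
        return text
    # Otherwise preserve the head and tail, dropping the middle, so the
    # model still sees how the output starts and ends.
    keep = (max_chars - len(TRUNCATION_MARKER)) // 2
    return text[:keep] + TRUNCATION_MARKER + text[-keep:]

capped = cap_output("x" * 10_000)
print(len(capped) <= 4096)  # → True
```

Preserving both ends matters in practice: headers and error summaries tend to appear at the start of command output, while final status lines appear at the end.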
Context Management for Long-Running Tasks
The service includes context compaction to handle long-running agent workflows that would otherwise fill the limited context window. Rather than requiring developers to build custom summarization logic, the platform automatically manages context across multiple turns while preserving key details.
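Without a platform feature like this, developers would typically roll their own compaction. A rough sketch of that pattern, with a placeholder `summarize` function standing in for a real model-based summarization call:

```python
def summarize(turns: list[str]) -> str:
    # Placeholder: a real implementation would call a model to summarize.
    # Here we just join the first line of each turn (illustrative only).
    return "Summary: " + " | ".join(t.splitlines()[0] for t in turns)

def compact_context(turns: list[str], max_turns: int = 4) -> list[str]:
    # When the transcript grows past the budget, fold the oldest turns into
    # a single summary entry while keeping the most recent turns verbatim.
    if len(turns) <= max_turns:
        return turns
    old, recent = turns[:-max_turns + 1], turns[-max_turns + 1:]
    return [summarize(old)] + recent

history = [f"turn {i}: ran command {i}" for i in range(10)]
print(compact_context(history))  # 1 summary entry + 3 verbatim recent turns
```

The platform's built-in compaction spares developers from maintaining this kind of logic, including deciding what counts as a "key detail" worth preserving across turns.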
Practical Benefits
This removes infrastructure burden from developers: no more building custom execution environments, managing file storage between steps, pasting large tables into prompts, or handling timeouts and retries manually. The platform provides a complete agent execution harness within the Responses API.