What's New
LangChain has released an autonomous context compression tool for the Deep Agents SDK (Python) and CLI. This feature allows AI agents to automatically decide when to compress their context windows—replacing older messages with summaries—rather than relying on fixed token thresholds or manual user commands.
The Problem
Traditional agent harnesses use static, hand-tuned rules to compress context at fixed thresholds (Deep Agents compacts at 85% of a model's context limit). This approach is suboptimal because:
- Timing matters: Compressing mid-refactor disrupts workflows, while compressing at task boundaries is cleaner and more effective
- User awareness: Users must manually trigger compression with commands like /compact, adding cognitive overhead
- Rigid thresholds: Fixed token limits don't account for when context is actually becoming irrelevant
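To make the critique concrete, here is a minimal sketch of the kind of static rule being replaced: compression fires purely on a token count crossing a fixed fraction of the context window, with no notion of timing. The function name and numbers are illustrative, not the actual Deep Agents implementation.

```python
COMPACT_THRESHOLD = 0.85  # Deep Agents compacts at 85% of the context limit

def should_compact(used_tokens: int, context_limit: int) -> bool:
    """Fixed-threshold rule: fires on token usage alone, ignoring timing."""
    return used_tokens / context_limit >= COMPACT_THRESHOLD

# Fires mid-refactor at 86% usage, yet stays silent at 84% even if most
# of the current context is already stale.
should_compact(172_000, 200_000)  # 86% -> True
should_compact(168_000, 200_000)  # 84% -> False
```

A rule like this cannot distinguish "mid-refactor" from "clean task boundary", which is exactly the gap the autonomous tool addresses.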
When Agents Trigger Compression
The tool guides agents to compress at opportune moments:
- Task boundaries: Starting a new task where prior context is irrelevant
- After extraction: Completing research or summarization tasks that consumed significant context
- Before large inputs: Preparing to read or generate substantial new content
- Before complex processes: Starting refactors, migrations, or multi-step executions
- After decisions: When new requirements invalidate prior context or resolve tangents
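One plausible way to surface this guidance to the agent is in the compression tool's description, which the model reads when deciding whether to call it. The following is a hedged, illustrative sketch in a generic tool-schema style, not the actual Deep Agents prompt or schema.

```python
# Hypothetical tool definition: the trigger heuristics above are encoded
# in the description the model sees, so the decision to compress rests
# with the agent rather than a fixed threshold. Names are illustrative.
SUMMARIZE_TOOL = {
    "name": "summarize_conversation",
    "description": (
        "Compress older conversation history into a summary. Good moments: "
        "starting an unrelated task, finishing research that consumed "
        "significant context, before reading or generating large content, "
        "before a refactor or migration, or after a decision invalidates "
        "prior context."
    ),
    "parameters": {"type": "object", "properties": {}},
}
```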
How It Works
The feature is implemented as middleware that preserves 10% of available context (recent messages) and summarizes everything before it. Deep Agents retains full conversation history in its virtual filesystem, allowing context recovery post-summarization. Testing shows agents are conservative about triggering compression but choose strategically beneficial moments when they do.
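The mechanics described above can be sketched in a few lines, under stated assumptions: keep the most recent ~10% of messages verbatim, replace everything earlier with a single summary message, and archive the full history so it remains recoverable. The `compress` function and the string summary are stand-ins; the real middleware summarizes with a model and stores history in the virtual filesystem.

```python
def compress(messages: list[str], archive: dict, keep_fraction: float = 0.10):
    """Keep the newest ~10% of messages; summarize and archive the rest."""
    keep = max(1, int(len(messages) * keep_fraction))
    older, recent = messages[:-keep], messages[-keep:]
    archive["full_history"] = list(messages)  # recoverable post-summarization
    summary = f"[summary of {len(older)} earlier messages]"  # stand-in for an LLM call
    return [summary, *recent]

archive: dict = {}
history = [f"msg {i}" for i in range(20)]
compressed = compress(history, archive)
# compressed: 1 summary message + 2 recent messages; archive holds all 20
```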
Getting Started
In the SDK:
from deepagents import create_deep_agent
from deepagents.middleware.summarization import create_summarization_tool_middleware

model = "openai:gpt-5.4"
backend = ...  # filesystem backend that stores full history; see the Deep Agents docs

agent = create_deep_agent(
    model=model,
    middleware=[create_summarization_tool_middleware(model, backend)],
)
In the CLI: call /compact when you're ready to trim context or move to a new task.
This release reflects LangChain's broader philosophy of giving models more control over their working memory rather than relying on rigid, hand-tuned harness rules.