Cloudflare
Cloudflare Agents SDK v0.7.0 adds observability overhaul, heartbeat keepAlive, and MCP connection waiting

Observability Rewrite

The observability system has been rebuilt on Node.js diagnostics channels, replacing the previous console.log-based approach. Events are now structured and silent by default, with zero overhead when nothing is subscribed. Every event carries a type, payload, and timestamp and is published to one of seven named channels:

  • agents:state - state updates
  • agents:rpc - RPC calls and errors
  • agents:message - message lifecycle, tool results, and approvals
  • agents:schedule - scheduling and queue operations
  • agents:lifecycle - connect/destroy events
  • agents:workflow - workflow state transitions
  • agents:mcp - MCP client connection events

Use the typed subscribe() helper from agents/observability for type-safe event access. In production, diagnostics channel messages are automatically forwarded to Tail Workers without needing subscription code in the agent itself.
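
To make the "silent by default" behavior concrete, here is a minimal sketch that uses Node's built-in node:diagnostics_channel module directly rather than the SDK's typed subscribe() helper, whose exact signature is not shown in this post. The AgentEvent interface and the "state:update" event type are assumptions based on the description above, and the publish call stands in for what the SDK would do internally.

```typescript
import { channel, subscribe, unsubscribe } from "node:diagnostics_channel";

// Assumed event shape, inferred from the description: type, payload, timestamp.
interface AgentEvent {
  type: string;
  payload: Record<string, unknown>;
  timestamp: number;
}

// "agents:state" is one of the seven channel names listed above.
const stateChannel = channel("agents:state");
const received: AgentEvent[] = [];

const onMessage = (message: unknown): void => {
  received.push(message as AgentEvent);
};

// With no subscriber, a publisher can skip building the event entirely.
const hadSubscribersBefore: boolean = stateChannel.hasSubscribers;

subscribe("agents:state", onMessage);

// Stand-in for the SDK publishing an event; "state:update" is a hypothetical type name.
if (stateChannel.hasSubscribers) {
  stateChannel.publish({
    type: "state:update",
    payload: { count: 1 },
    timestamp: Date.now(),
  });
}

// Subscribers are called synchronously, so the event has already been recorded here.
unsubscribe("agents:state", onMessage);
```

Guarding the publish call with hasSubscribers is the usual way a publisher avoids paying for event construction when no listener is attached.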

Keeping Agents Alive During Long Operations

Two new methods prevent Durable Object eviction during long-running operations:

  • keepAlive() - Creates a 30-second heartbeat schedule that resets the inactivity timer, returning a disposer function to cancel when done
  • keepAliveWhile() - Wraps async functions with automatic heartbeat management, starting before the function runs and stopping when it completes

Multiple concurrent callers are supported, each with its own independent disposer. AIChatAgent automatically calls keepAlive() during streaming responses, so you don't need to add it manually. Because the heartbeat is built on the scheduling system, it also appears in the output of getSchedules().
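
The disposer semantics described above can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the SDK's implementation: HeartbeatModel and activeHeartbeats are hypothetical names, the real methods live on the agent class and may be asynchronous, and the 30-second schedule itself is omitted so that only the bookkeeping is shown.

```typescript
// Hypothetical stand-in for an agent; it models only the disposer bookkeeping.
class HeartbeatModel {
  private readonly active = new Set<symbol>();

  // Registers a heartbeat and returns a disposer that cancels only this one.
  keepAlive(): () => void {
    const token = Symbol("heartbeat");
    this.active.add(token);
    return () => {
      this.active.delete(token); // calling the disposer twice is harmless
    };
  }

  // Starts a heartbeat before fn runs and stops it when fn settles, even on error.
  async keepAliveWhile<T>(fn: () => Promise<T>): Promise<T> {
    const dispose = this.keepAlive();
    try {
      return await fn();
    } finally {
      dispose();
    }
  }

  get activeHeartbeats(): number {
    return this.active.size;
  }
}

// Records the number of active heartbeats at each step.
async function demo(): Promise<number[]> {
  const agent = new HeartbeatModel();
  const counts: number[] = [];

  const disposeA = agent.keepAlive();
  const disposeB = agent.keepAlive();
  counts.push(agent.activeHeartbeats); // two independent callers

  disposeA();
  counts.push(agent.activeHeartbeats); // B is unaffected by A's disposer

  await agent.keepAliveWhile(async () => {
    counts.push(agent.activeHeartbeats); // B plus the wrapper's own heartbeat
  });
  counts.push(agent.activeHeartbeats); // the wrapper cleaned up after itself

  disposeB();
  disposeB();
  counts.push(agent.activeHeartbeats);
  return counts;
}
```

The try/finally in keepAliveWhile is the main reason to prefer the wrapper over a manual keepAlive() call: the heartbeat is released even if the wrapped function throws.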

MCP Connection Waiting

AIChatAgent now waits for MCP server connections to settle before calling onChatMessage, so this.mcp.getAITools() returns the full set of tools. This matters most after Durable Object hibernation, when connections are still being restored in the background. Configure the behavior via the waitForMcpConnections property:

  • { timeout: 10_000 } (default) - Wait up to 10 seconds
  • true - Wait indefinitely
  • false - Disable waiting

This prevents tool availability race conditions in production deployments.
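
The three option forms can be read as three waiting strategies. The sketch below models them in plain TypeScript and is only an illustration: waitForConnections, WaitOption, and the returned labels are hypothetical names, and the SDK's internal logic may differ.

```typescript
// Mirrors the three documented forms of waitForMcpConnections.
type WaitOption = boolean | { timeout: number };
type WaitResult = "settled" | "timed-out" | "skipped";

// Hypothetical helper: waits for pending connection attempts according to the option.
async function waitForConnections(
  pending: Promise<unknown>[],
  option: WaitOption
): Promise<WaitResult> {
  // false: do not wait at all.
  if (option === false) return "skipped";

  // A connection counts as settled whether it succeeded or failed.
  const all: Promise<WaitResult> = Promise.all(
    pending.map((p) => p.then(() => undefined, () => undefined))
  ).then((): WaitResult => "settled");

  // true: wait indefinitely.
  if (option === true) return all;

  // { timeout }: wait, but give up after the given number of milliseconds.
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<WaitResult>((resolve) => {
    timer = setTimeout(() => resolve("timed-out"), option.timeout);
  });
  try {
    return await Promise.race([all, timeout]);
  } finally {
    clearTimeout(timer); // avoid leaving a stray timer once the race is decided
  }
}
```

With the default of { timeout: 10_000 }, a server that never responds delays the chat handler by at most 10 seconds, whereas true would block it until every connection attempt resolves or fails.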