GLM-4.7-Flash: New Multilingual Model for Workers AI
Cloudflare has introduced GLM-4.7-Flash, a fast and efficient multilingual text generation model optimized for dialogue, instruction following, and agent applications. The model features a 131,072-token context window, making it suitable for long-form content generation, complex reasoning, and processing lengthy documents.
Key capabilities include:
- Multi-turn tool calling for building AI agents that invoke functions across conversation turns
- Native multilingual support for content generation across multiple languages
- Fast inference optimized for low-latency chatbots and virtual assistants
- Excellent instruction-following for code generation and structured tasks
Access the model through Workers AI bindings (env.AI.run()), REST endpoints (/run or /v1/chat/completions), AI Gateway, or the Vercel AI SDK via the workers-ai-provider.
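A minimal Worker sketch of the binding path. The model identifier `@cf/zai/glm-4.7-flash` is an assumption here; check the Workers AI model catalog for the exact ID.

```typescript
// Sketch of calling the model through a Workers AI binding (env.AI.run()).
// The model ID "@cf/zai/glm-4.7-flash" is an assumption -- consult the
// Workers AI model catalog for the real identifier.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Build the request body separately so it is easy to reuse and test.
export function buildChatRequest(userPrompt: string): { messages: ChatMessage[]; max_tokens: number } {
  return {
    messages: [
      { role: "system", content: "You are a concise multilingual assistant." },
      { role: "user", content: userPrompt },
    ],
    max_tokens: 256,
  };
}

const worker = {
  async fetch(
    _request: Request,
    env: { AI: { run: (model: string, input: unknown) => Promise<unknown> } },
  ): Promise<Response> {
    const result = await env.AI.run(
      "@cf/zai/glm-4.7-flash", // assumed model ID
      buildChatRequest("Summarize this document in French."),
    );
    return Response.json(result);
  },
};

export default worker;
```

The same request body also works against the REST endpoints; only the transport changes.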
TanStack AI Integration
The new @cloudflare/tanstack-ai package brings Workers AI and AI Gateway support to TanStack AI, offering a framework-agnostic alternative for developers preferring TanStack's approach. The adapters support four configuration modes: plain binding, plain REST, AI Gateway binding, and AI Gateway REST.
Available adapters:
- Chat (createWorkersAiChat) — Streaming completions with tool calling and structured output
- Image generation (createWorkersAiImage) — Text-to-image models
- Transcription (createWorkersAiTranscription) — Speech-to-text processing
- Text-to-speech (createWorkersAiTts) — Audio generation
- Summarization (createWorkersAiSummarize) — Text summarization
AI Gateway adapters also route requests from OpenAI, Anthropic, Gemini, Grok, and OpenRouter through Cloudflare's infrastructure for caching, rate limiting, and unified billing.
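The gateway routing above can be sketched with a plain fetch call. The URL follows AI Gateway's documented pattern (`gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider}/...`); the account ID, gateway name, and model below are placeholders.

```typescript
// Build an AI Gateway URL for a given upstream provider.
// Account and gateway IDs are placeholders, not real values.
export function gatewayUrl(accountId: string, gatewayId: string, provider: string, path: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/${provider}/${path}`;
}

// Example: send an OpenAI chat completion through the gateway so it
// benefits from caching, rate limiting, and unified billing.
async function chatViaGateway(apiKey: string): Promise<unknown> {
  const res = await fetch(gatewayUrl("ACCOUNT_ID", "my-gateway", "openai", "chat/completions"), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // any model the upstream provider supports
      messages: [{ role: "user", content: "Hello" }],
    }),
  });
  return res.json();
}
```

Swapping `openai` for `anthropic`, `google-ai-studio`, or another supported provider segment routes the same request shape to a different upstream.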
Enhanced Vercel AI SDK Support
The workers-ai-provider v3.1.1 release adds three new capabilities to the Vercel AI SDK integration: transcription (speech-to-text), text-to-speech (audio generation), and reranking (document reordering for RAG pipelines and search results).
The update also includes a comprehensive reliability overhaul addressing streaming, tool calling, and premature stream termination. Streams now process token-by-token with proper backpressure handling, tool call ID sanitization has been fixed, and unexpected stream terminations now correctly report error status instead of silently succeeding.
Getting started:
- TanStack AI adapters: npm install @cloudflare/tanstack-ai @tanstack/ai
- Vercel AI SDK provider: npm install workers-ai-provider@latest ai
All three updates enable developers to build production-grade AI applications that run entirely on Cloudflare's edge network with framework flexibility and expanded multimodal capabilities.