GLM-4.7-Flash: New Multilingual Model for Workers AI
Cloudflare has introduced GLM-4.7-Flash, a fast and efficient multilingual text generation model optimized for dialogue, instruction following, and agent applications. The model features a 131,072-token context window, making it suitable for long-form content generation, complex reasoning, and processing lengthy documents.
Key capabilities include:
- Multi-turn tool calling for building AI agents that invoke functions across conversation turns
- Native multilingual support for content generation across multiple languages
- Fast inference optimized for low-latency chatbots and virtual assistants
- Excellent instruction-following for code generation and structured tasks
Access the model through Workers AI bindings (env.AI.run()), REST endpoints (/run or /v1/chat/completions), AI Gateway, or the Vercel AI SDK via the workers-ai-provider.
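A minimal Worker sketch of the binding path. The model identifier `@cf/zai/glm-4.7-flash` is an assumption here; check the Workers AI model catalog for the exact ID.

```typescript
// Sketch of calling the model through a Workers AI binding (env.AI.run()).
// The model ID "@cf/zai/glm-4.7-flash" is an assumption -- consult the
// Workers AI model catalog for the real identifier.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Build the request body separately so it is easy to reuse and test.
export function buildChatRequest(userPrompt: string): { messages: ChatMessage[]; max_tokens: number } {
  return {
    messages: [
      { role: "system", content: "You are a concise multilingual assistant." },
      { role: "user", content: userPrompt },
    ],
    max_tokens: 256,
  };
}

const worker = {
  async fetch(
    _request: Request,
    env: { AI: { run: (model: string, input: unknown) => Promise<unknown> } },
  ): Promise<Response> {
    const result = await env.AI.run(
      "@cf/zai/glm-4.7-flash", // assumed model ID
      buildChatRequest("Summarize this document in French."),
    );
    return Response.json(result);
  },
};

export default worker;
```

The same request body also works against the REST endpoints; only the transport changes.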
TanStack AI Integration
The new @cloudflare/tanstack-ai package brings Workers AI and AI Gateway support to TanStack AI, offering a framework-agnostic alternative for developers preferring TanStack's approach. The adapters support four configuration modes: plain binding, plain REST, AI Gateway binding, and AI Gateway REST.
Available adapters:
- Chat (createWorkersAiChat) — Streaming completions with tool calling and structured output
- Image generation (createWorkersAiImage) — Text-to-image models
- Transcription (createWorkersAiTranscription) — Speech-to-text processing
- Text-to-speech (createWorkersAiTts) — Audio generation
- Summarization (createWorkersAiSummarize) — Text summarization
AI Gateway adapters also route requests from OpenAI, Anthropic, Gemini, Grok, and OpenRouter through Cloudflare's infrastructure for caching, rate limiting, and unified billing.
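The gateway routing above can be sketched with a plain fetch call. The URL follows AI Gateway's documented pattern (`gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider}/...`); the account ID, gateway name, and model below are placeholders.

```typescript
// Build an AI Gateway URL for a given upstream provider.
// Account and gateway IDs are placeholders, not real values.
export function gatewayUrl(accountId: string, gatewayId: string, provider: string, path: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/${provider}/${path}`;
}

// Example: send an OpenAI chat completion through the gateway so it
// benefits from caching, rate limiting, and unified billing.
async function chatViaGateway(apiKey: string): Promise<unknown> {
  const res = await fetch(gatewayUrl("ACCOUNT_ID", "my-gateway", "openai", "chat/completions"), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // any model the upstream provider supports
      messages: [{ role: "user", content: "Hello" }],
    }),
  });
  return res.json();
}
```

Swapping `openai` for `anthropic`, `google-ai-studio`, or another supported provider segment routes the same request shape to a different upstream.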
Enhanced Vercel AI SDK Support
The workers-ai-provider v3.1.1 release adds three new capabilities to the Vercel AI SDK integration: transcription (speech-to-text), text-to-speech (audio generation), and reranking (document reordering for RAG pipelines and search results).
The update also includes a comprehensive reliability overhaul addressing streaming, tool calling, and premature stream termination. Streams now process token-by-token with proper backpressure handling, tool call ID sanitization has been fixed, and unexpected stream terminations now correctly report error status instead of silently succeeding.
Getting started:
- TanStack AI adapters: npm install @cloudflare/tanstack-ai @tanstack/ai
- Vercel AI SDK provider: npm install workers-ai-provider@latest ai
All three updates enable developers to build production-grade AI applications that run entirely on Cloudflare's edge network with framework flexibility and expanded multimodal capabilities.