Qwen3.5 Model Family
Alibaba has launched Qwen3.5, a comprehensive model family designed to serve diverse deployment scenarios. The lineup includes:
- Large models: 35B-A3B, 27B, 122B-A10B, and 397B-A17B (the "-A" suffix gives the number of active parameters in the mixture-of-experts variants)
- Small models: 0.8B, 2B, 4B, and 9B parameters
- Multimodal capabilities: Hybrid reasoning LLMs supporting vision, text, and agentic coding tasks
Key Features
Context & Language Support
- 256K context window (extendable to 1M via YaRN)
- Multilingual support across 201 languages
- Supports up to 32,768 output tokens
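The 1M extension via YaRN is a RoPE-scaling override applied at load time. The sketch below is illustrative only: the field names follow the Transformers `rope_scaling` convention used by earlier Qwen releases, and the values are assumptions derived from the 256K → 1M figures above, so check the Qwen3.5 model card before using them.

```python
# Hypothetical YaRN override; field names follow the Transformers
# rope_scaling convention, values derived from the 256K -> 1M figures.
NATIVE_CTX = 262_144    # 256K native window
TARGET_CTX = 1_048_576  # 1M extended window

yarn_rope_scaling = {
    "rope_type": "yarn",
    "factor": TARGET_CTX / NATIVE_CTX,  # 4.0
    "original_max_position_embeddings": NATIVE_CTX,
}

# Typically passed when loading the model, e.g.:
# AutoModelForCausalLM.from_pretrained(model_id, rope_scaling=yarn_rope_scaling)
```

Note that YaRN scaling is usually applied statically, so it is best enabled only when long contexts are actually needed.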
Reasoning Capabilities
- Hybrid thinking and non-thinking modes for flexible inference
- Thinking mode optimized for complex reasoning tasks
- Non-thinking (Instruct) mode for faster, direct responses
- Reasoning disabled by default on Small models (0.8B-9B)
Hardware Requirements
The models support multiple quantization levels with varying memory footprints:
- 35B-A3B: 22GB (4-bit) on compatible devices like high-end Macs
- 27B: 17GB (4-bit)
- Small models (0.8B-9B): As low as 3GB (3-bit) to 19GB (BF16)
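As a sanity check on these footprints, a back-of-the-envelope estimate is parameters × bits ÷ 8 for the weights, plus some overhead for quantization metadata and runtime buffers. The ~20% overhead factor below is an assumption for illustration, not a published figure.

```python
def est_memory_gb(params_billions: float, bits: float, overhead: float = 1.2) -> float:
    """Rough memory footprint: weight bytes (params * bits / 8) scaled by
    an assumed ~20% overhead for quant metadata and runtime buffers."""
    weight_gb = params_billions * bits / 8  # 1B params at 8-bit = 1 GB
    return weight_gb * overhead

# 27B at 4-bit: ~16 GB, close to the 17 GB figure above
print(round(est_memory_gb(27, 4), 1))  # → 16.2
```

The estimate excludes the KV cache, which grows with context length and can dominate memory at the 256K-token window.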
Deployment & Optimization
All model uploads use Unsloth Dynamic 2.0 quantization, which selectively upcasts important layers to 8- or 16-bit precision within otherwise 4-bit quants for superior performance. GGUF variants are available for llama.cpp-compatible backends (currently not compatible with Ollama).
Fine-tuning support is available through Unsloth, and comprehensive inference tutorials are provided for each model size. Developers can control reasoning behavior via chat template parameters (enable_thinking flag).
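The enable_thinking switch can be sketched as a chat-template branch: when thinking is disabled, an empty think block is pre-filled so the model skips its reasoning trace and answers directly. The fragment below mirrors the ChatML-style template used by earlier Qwen releases and is a simplified assumption, not the exact Qwen3.5 template.

```python
def render_prompt(user_msg: str, enable_thinking: bool = True) -> str:
    """Simplified ChatML-style prompt builder. With thinking disabled,
    an empty <think></think> block is pre-filled (assumed behavior,
    mirroring earlier Qwen hybrid-reasoning templates)."""
    prompt = (
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    if not enable_thinking:
        prompt += "<think>\n\n</think>\n\n"
    return prompt
```

In practice the flag is passed to the tokenizer rather than built by hand, e.g. `tokenizer.apply_chat_template(messages, add_generation_prompt=True, enable_thinking=False)`.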
Recent Updates
A March 2 update delivered tool-calling improvements via chat template fixes, with the benefits applying across all Qwen3.5 formats and uploads. MXFP4 layers have been retired from select quantization variants (Q2_K_XL, Q3_K_XL, Q4_K_XL) based on quantization sensitivity analysis.