Unsloth — AI Infrastructure

https://unsloth.ai

Changelogs

Alibaba releases Qwen3.5 model family; 397B variant matches Gemini 3 Pro and Claude Opus 4.5

Alibaba's Qwen3.5 is a new multimodal model family ranging from 27B to 397B parameters with a mixture-of-experts architecture. The flagship 397B-A17B variant supports 256K context (expandable to 1M) and hybrid thinking/non-thinking modes, and it excels in coding, vision, and long-context reasoning, with performance comparable to leading proprietary models.
Tags: release, model, feature, open-source
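A minimal sketch of toggling the hybrid thinking mode, assuming Qwen3.5 keeps the same `enable_thinking` chat-template switch that Qwen3 uses; the model id below is a placeholder, not a Qwen3.5 checkpoint name from this entry.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the actual Qwen3.5 model id once published.
model_id = "Qwen/Qwen3-30B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Summarize mixture-of-experts routing in one sentence."}]

# enable_thinking=True asks the chat template to emit a reasoning region before
# the final answer; set it to False for direct, non-thinking responses.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```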
Unsloth ships MoE training kernels with 12x speedup and 35% lower VRAM usage

Unsloth introduced custom Triton kernels and optimizations for training Mixture of Experts (MoE) language models, delivering 12x faster training with over 35% lower VRAM consumption and support for 6x longer context windows. The update covers popular MoE models, including Qwen3, DeepSeek R1/V3, and GPT-OSS, and works across data-center and consumer GPUs.
Tags: feature, performance, sdk, open-source
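For a sense of how the update is used in practice, here is a minimal fine-tuning sketch with Unsloth's standard loading and LoRA APIs; the checkpoint name, sequence length, and LoRA hyperparameters are illustrative assumptions rather than values taken from this entry.

```python
from unsloth import FastLanguageModel

# Load a supported MoE checkpoint through Unsloth's optimized loading path.
# (Assumed model id and settings; pick any of the supported MoE models.)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-30B-A3B",  # assumed MoE checkpoint
    max_seq_length=4096,                 # assumed training context length
    load_in_4bit=True,                   # 4-bit weights to further reduce VRAM
)

# Attach LoRA adapters; the new MoE kernels apply during training of these layers.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # helps enable longer-context training
)

# From here, train as usual (e.g., with TRL's SFTTrainer) on your dataset.
```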