Ollama v0.17.5 fixes Qwen 3.5 crashes and MLX runner memory issues | StandardDB

release · bugfix · performance · github.com

Bug Fixes and Improvements

This patch release focuses on stability and performance improvements across Ollama's model runners:

Qwen 3.5 Model Fixes

Crash prevention: Fixed a critical crash that occurred when Qwen 3.5 models were split between GPU and CPU resources
Output quality: Resolved an issue that caused Qwen 3.5 models to repeat themselves because the presence penalty was missing
Model compatibility: Fixed support for models imported from Qwen 3.5 GGUF files

MLX Engine Enhancements

Improved memory management to eliminate crashes and resource leaks in the MLX runner
Enhanced ollama run --verbose to display peak memory usage statistics when using Ollama's MLX engine

Action Items for Users

Users who have already downloaded a Qwen 3.5 model should re-download it to pick up the presence penalty fix, which reduces repetition in generated output. For example, for the 35b tag (substitute the tag you use):

ollama pull qwen3.5:35b
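
Since the crash and MLX fixes ship in the Ollama binary, it can be useful to confirm that the local install is at v0.17.5 or later. The following is a minimal sketch, not part of the release notes: version_at_least is a hypothetical helper, and the script assumes that ollama --version prints a dotted x.y.z version somewhere in its output and that sort supports -V (version sort).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort): if the smaller of the two versions is $2,
# then $1 is at least $2.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="0.17.5"

if command -v ollama >/dev/null 2>&1; then
    # Assumption: the output contains a dotted x.y.z version; take the first match.
    installed="$(ollama --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
    if [ -n "$installed" ] && version_at_least "$installed" "$required"; then
        echo "Ollama $installed is at or above $required."
    else
        echo "Ollama ${installed:-unknown} is older than $required; the runner fixes are not included."
    fi
else
    echo "ollama not found on PATH"
fi
```

The check only prints a message; it does not pull or modify any models.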

This release is available for macOS, Linux (AMD64 and ARM64), and Windows, with support for ROCm accelerators.

Tags

release, bugfix, performance

Published

today
