# Ollama v0.15.5 adds two new models and improves agentic coding support
## New Models
Ollama v0.15.5 introduces two new models to the library:
- Qwen3-Coder-Next: A coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development environments.
- GLM-OCR: A multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture for advanced text and layout recognition.
## Improvements to Agentic Features
The `ollama launch` command receives significant enhancements for agent-based workflows:
- Sub-agent support: `ollama launch` can now spawn and manage sub-agents for planning, deep research, and similar multi-step tasks.
- Flexible arguments: Arguments can now be passed through `ollama launch`, e.g., `ollama launch claude -- --resume`.
- Model-specific context tuning: Context limits are automatically set for specific models (e.g., `ollama launch opencode`).
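The `--` separator shown above follows the common CLI convention of splitting the launcher's own arguments from those passed through verbatim to the launched agent. A minimal sketch of that convention (illustrative only, not Ollama's actual implementation; the function name is hypothetical):

```python
def split_launch_args(argv):
    """Split an argument vector at the `--` separator.

    Everything before `--` is interpreted by the launcher itself;
    everything after is forwarded untouched to the launched tool.
    Illustrative sketch of the convention, not Ollama's code.
    """
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return argv, []

# e.g. `ollama launch claude -- --resume`
launch_args, passthrough = split_launch_args(["claude", "--", "--resume"])
print(launch_args, passthrough)  # ['claude'] ['--resume']
```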
## System Improvements
- Automatic context length tuning: Ollama now chooses a default context length based on available VRAM:
  - Less than 24 GiB: 4,096 tokens
  - 24–48 GiB: 32,768 tokens
  - 48 GiB or more: 262,144 tokens
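The VRAM tiers above amount to a simple threshold lookup. A minimal sketch of that mapping, using the tier values from these release notes (the function itself is illustrative, not Ollama's implementation):

```python
def default_context_length(vram_gib: float) -> int:
    """Pick a default context length from available VRAM in GiB.

    Tier boundaries are taken from the Ollama v0.15.5 release notes;
    this function is an illustrative sketch, not Ollama's code.
    """
    if vram_gib < 24:
        return 4_096       # less than 24 GiB
    if vram_gib < 48:
        return 32_768      # 24-48 GiB
    return 262_144         # 48 GiB or more

print(default_context_length(16))  # 4096
print(default_context_length(32))  # 32768
print(default_context_length(80))  # 262144
```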
- Simplified authentication: `ollama signin` now opens a browser window directly to the connection page.
- MLX engine expansion: Added support for GLM-4.7-Flash on the experimental MLX engine.
## Bug Fixes
- Fixed an off-by-one error when using the `num_predict` API parameter.
- Resolved an issue where tokens from previous sequences could be incorrectly returned when hitting `num_predict` limits.
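The class of bug behind the `num_predict` fix is a familiar one: a generation loop whose limit check runs at the wrong point emits one token too many. A minimal sketch of the correct check (illustrative only, not Ollama's actual implementation):

```python
def generate(tokens, num_predict):
    """Collect at most `num_predict` tokens from a token stream.

    The bound is tested *before* appending, so exactly `num_predict`
    tokens are returned at most; checking after the append, or using
    `<=`, would produce the classic off-by-one. Illustrative sketch,
    not Ollama's code.
    """
    out = []
    for tok in tokens:
        if len(out) >= num_predict:  # stop before exceeding the limit
            break
        out.append(tok)
    return out

print(len(generate(iter(range(100)), 10)))  # 10
```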