Ollama v0.15.5 adds two new models and improves agentic coding support
release · feature · api · model · bugfix · github.com

New Models

Ollama v0.15.5 introduces two new models to the library:

  • Qwen3-Coder-Next: A coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development environments.
  • GLM-OCR: A multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture for advanced text and layout recognition.

Improvements to Agentic Features

The ollama launch command receives significant enhancements for agent-based workflows:

  • Sub-agent support: ollama launch can now spawn and manage sub-agents for planning, deep research, and similar multi-step tasks.
  • Flexible arguments: Arguments can now be passed through ollama launch, e.g., ollama launch claude -- --resume.
  • Model-specific context tuning: Context limits are automatically set for specific models (e.g., ollama launch opencode).

System Improvements

  • Automatic context length tuning: Ollama now defaults context lengths based on available VRAM:
    • Less than 24 GiB: 4,096 tokens
    • 24–48 GiB: 32,768 tokens
    • 48 GiB or more: 262,144 tokens
  • Simplified authentication: ollama signin now opens a browser window directly to the sign-in page.
  • MLX engine expansion: Added support for GLM-4.7-Flash on the experimental MLX engine.
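The VRAM tiers above can be sketched as a small lookup. This is an illustrative helper, not Ollama's actual implementation; the function name and the GiB input are assumptions, while the thresholds and token counts come from the release notes.

```python
def default_context_length(vram_gib: float) -> int:
    """Illustrative mapping of available VRAM (in GiB) to the default
    context length Ollama v0.15.5 selects, per the release notes.
    Hypothetical helper; not part of the Ollama codebase."""
    if vram_gib < 24:
        return 4_096
    elif vram_gib < 48:
        return 32_768
    else:
        return 262_144
```

For example, a 24 GiB GPU falls into the middle tier and gets a 32,768-token default, while a 48 GiB GPU already qualifies for the largest tier.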

Bug Fixes

  • Fixed an off-by-one error when using the num_predict API parameter.
  • Resolved an issue where tokens from previous sequences would be incorrectly returned when hitting num_predict limits.
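For context, num_predict is set per request through the options object of Ollama's generate API; the fixes above concern how the server enforces that cap. The sketch below only builds such a request payload (the model tag is a hypothetical example, and no server call is made):

```python
import json

# Illustrative /api/generate request body capping generation at
# 8 tokens via num_predict -- the parameter whose off-by-one
# behavior v0.15.5 fixes. The model tag is a hypothetical example.
payload = {
    "model": "qwen3-coder-next",
    "prompt": "Write a haiku about autumn.",
    "stream": False,
    "options": {"num_predict": 8},
}

body = json.dumps(payload)  # JSON string to POST to /api/generate
```

With the fix, a request like this should return at most exactly 8 generated tokens, with no stray tokens carried over from earlier sequences.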