Qwen3.5 Now Available on NVIDIA Infrastructure
Alibaba has released Qwen3.5, a 397-billion-parameter native vision-language model designed for multimodal agents. The model uses a hybrid architecture combining mixture of experts (MoE) with Gated Delta Networks, activating only 17B parameters per token, a 4.28% activation rate. It supports 256K-token context windows (extensible to 1M), covers 200+ languages, and can understand and navigate complex user interfaces.
Immediate Access for Developers
Developers can start building with Qwen3.5 immediately through multiple channels:
- Free GPU-accelerated endpoints on build.nvidia.com powered by NVIDIA Blackwell GPUs, available to registered NVIDIA Developer Program members
- API access through NVIDIA's hosted endpoints with free usage tier
- Full code examples and OpenAI-compatible chat completion APIs for rapid integration
The model excels at coding tasks, visual reasoning over mobile and web interfaces, chat applications, and complex search scenarios.
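Because the hosted endpoints are OpenAI-compatible, a standard chat completion request is all an integration needs. A minimal standard-library sketch, assuming the common integrate.api.nvidia.com base URL and a hypothetical qwen/qwen3.5 model identifier (check build.nvidia.com for the exact values):

```python
import json
import urllib.request

# Assumptions: the base URL follows NVIDIA's usual OpenAI-compatible pattern,
# and the model id below is hypothetical -- verify both on build.nvidia.com.
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL_ID = "qwen/qwen3.5"  # hypothetical identifier

def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat completion request."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def send(req: urllib.request.Request) -> str:
    """Send the request (requires a valid API key) and return the reply text."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (not executed here -- needs a real key from the NVIDIA API portal):
# print(send(build_chat_request("Describe this UI screenshot.", "nvapi-...")))
```

The same request body works unchanged against any OpenAI-compatible SDK, so swapping in the official `openai` client is a one-line change to the base URL.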
Production Deployment and Customization
For production use, NVIDIA NIM provides containerized inference microservices with optimized performance, standardized APIs, and deployment flexibility across on-premises, cloud, and hybrid environments.
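As a rough illustration of the NIM workflow, a self-hosted deployment typically amounts to running the container and pointing clients at its OpenAI-compatible port. A deployment-config sketch; the image path below is hypothetical, so confirm the actual Qwen3.5 NIM in the NVIDIA NGC catalog:

```shell
# Authenticate against NGC (key placeholder -- substitute your own credential)
export NGC_API_KEY="nvapi-..."

# Launch the NIM microservice on all local GPUs; image tag is hypothetical
docker run -d --gpus all \
  -e NGC_API_KEY \
  -p 8000:8000 \
  nvcr.io/nim/qwen/qwen3.5:latest

# Once up, the container exposes the same standardized API locally
curl http://localhost:8000/v1/models
```

Because the local container serves the same API surface as the hosted endpoint, prototype code written against build.nvidia.com can be repointed at `localhost:8000` without changes.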
The NVIDIA NeMo framework enables fine-tuning for specialized domains. Key capabilities include:
- PyTorch-native training with Day 0 Hugging Face checkpoint support (no conversion needed)
- Memory-efficient methods like LoRA for cost-effective adaptation
- Multinode deployment on Slurm and Kubernetes for large-scale training
- A reference implementation for fine-tuning on medical visual QA and radiology datasets
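To see why LoRA is memory-efficient, compare parameter counts for a single projection layer: a rank-r adapter trains r × (d_in + d_out) parameters in place of the full d_in × d_out weight update. A small sketch (layer dimensions are illustrative, not Qwen3.5's actual shapes):

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Parameters in a full weight matrix vs. a rank-`rank` LoRA adapter."""
    full = d_in * d_out            # frozen base weight W
    lora = rank * (d_in + d_out)   # trainable factors A (d_in x r) and B (r x d_out)
    return full, lora

full, lora = lora_param_counts(4096, 4096, 16)
print(f"trainable fraction: {lora / full:.2%}")  # rank-16 trains under 1% of the layer
```

This is what makes adapter-based fine-tuning cost-effective: optimizer state and gradients are kept only for the small factors, while the base weights stay frozen.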
Getting Started
Developers can access Qwen3.5 immediately on build.nvidia.com, experiment with prompts, and test against their own data. Integration with existing NVIDIA infrastructure (Blackwell GPUs, NIM, NeMo) enables seamless scaling from prototyping to enterprise production workloads.