NVIDIA
NVIDIA releases Cosmos Transfer 2.5, Predict 2.5, and Reason 2 for synthetic data generation and physical AI reasoning
· release · feature · model · api · developer.nvidia.com

NVIDIA Cosmos World Foundation Models Get Major Updates

NVIDIA has announced three significant updates to its Cosmos world foundation model platform, one year after the platform's initial introduction. The updated models address the challenge of generating high-fidelity, physics-aware training data for AI-driven robots and autonomous vehicles.

Cosmos Transfer 2.5: Photorealistic Synthetic Data Generation

Cosmos Transfer 2.5 enables faster and more scalable data augmentation from simulation and 3D spatial inputs. The model uses a ControlNet architecture to preserve pretrained knowledge while generating photorealistic video sequences with controlled composition. Key capabilities include:

  • Accepts structured visual inputs (segmentation maps, depth maps, edge maps, human motion keypoints, LiDAR scans, HD maps, and 3D bounding boxes)
  • Generates high-fidelity, physics-grounded synthetic data with diverse environments and lighting conditions
  • Enables fine-grained control over scene composition, object placement, and motion dynamics
  • Integrates with NVIDIA Omniverse for ground truth simulation inputs
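
To make the idea of a structured visual input concrete, the following is a minimal sketch of deriving an edge map from a single grayscale frame. It is plain Python and does not call any Cosmos API; the function name `edge_map`, the `threshold` parameter, and the synthetic frame are illustrative assumptions, and a real pipeline would more likely take such maps from a simulator or an image-processing library.

```python
from math import hypot

def edge_map(gray, threshold=0.2):
    """Return a binary edge map (0 or 255) from a 2D list of grayscale values.

    Hypothetical helper for illustration only; not part of any Cosmos API.
    """
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, with neighbours clamped at the image border.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            mag[y][x] = hypot(gx, gy)
    peak = max(max(row) for row in mag)
    if peak == 0:
        # A perfectly flat frame has no edges.
        return [[0] * w for _ in range(h)]
    # Keep pixels whose gradient magnitude is at least `threshold` of the peak.
    return [[255 if v / peak >= threshold else 0 for v in row] for row in mag]

# Synthetic frame: dark left half, bright right half.
frame = [[0, 0, 0, 0, 200, 200, 200, 200] for _ in range(4)]
edges = edge_map(frame)
print(edges[0])  # [0, 0, 0, 255, 255, 0, 0, 0]
```

The output marks only the boundary between the two halves, which is the kind of sparse structural signal a control input carries while leaving appearance (texture, lighting) for the generative model to fill in.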

Cosmos Predict 2.5: Enhanced Scenario Generation

Cosmos Predict 2.5 delivers improved long-tail scenario generation for video sequences of up to 30 seconds. According to NVIDIA, the model achieves up to 10x higher accuracy when post-trained on proprietary or domain-specific data. New features include:

  • Multiview output support with custom camera layouts
  • Alternative policy outputs such as action simulation
  • Enhanced accuracy for edge cases and rare scenarios
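
As a rough illustration of what a custom camera layout might involve, the sketch below describes a rig as a list of cameras and checks whether their horizontal views cover a full circle. Everything here is hypothetical: the `Camera` fields, `ring_layout`, and `covers_full_circle` are invented for this example, and the source does not describe how Cosmos Predict 2.5 actually represents layouts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Camera:
    name: str
    yaw_deg: float  # heading of the optical axis, 0 = straight ahead
    fov_deg: float  # horizontal field of view

def ring_layout(n, fov_deg):
    """Place n cameras at evenly spaced yaw angles (hypothetical helper)."""
    if n < 1:
        raise ValueError("a layout needs at least one camera")
    step = 360.0 / n
    return [Camera(f"cam_{i}", i * step, fov_deg) for i in range(n)]

def covers_full_circle(cameras):
    """True if neighbouring horizontal views leave no angular gap."""
    cams = sorted(cameras, key=lambda c: c.yaw_deg % 360.0)
    if not cams:
        return False
    for i, cur in enumerate(cams):
        nxt = cams[(i + 1) % len(cams)]
        gap = (nxt.yaw_deg % 360.0) - (cur.yaw_deg % 360.0)
        if i == len(cams) - 1:
            gap += 360.0  # wrap from the last camera back to the first
        # Each camera covers half of its field of view toward its neighbour.
        if gap > (cur.fov_deg + nxt.fov_deg) / 2:
            return False
    return True

rig = ring_layout(6, 70.0)
print(covers_full_circle(rig))                    # True: 70 degree views, 60 degree spacing
print(covers_full_circle(ring_layout(6, 50.0)))   # False: 10 degree gaps remain
```

A check like this is useful before generating multiview data, since blind spots in the layout would otherwise only show up later as missing coverage in the training set.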

Cosmos Reason 2: Advanced Physical AI Reasoning

Cosmos Reason 2 introduces significantly improved spatiotemporal understanding for complex reasoning tasks. Major enhancements include:

  • Advanced chain-of-thought reasoning for context-aware decision-making
  • Object detection with 2D/3D point localization and bounding box coordinates
  • Support for tasks like object localization and motion prediction
  • Expanded long-context support up to 256K input tokens
  • Improved timestamp precision for temporal reasoning
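
Bounding box output usually needs a small amount of post-processing before it can be drawn or scored. The sketch below assumes, purely for illustration, that the model's answer arrives as a JSON list of objects with a `label` and a normalized `box_2d` of `[x1, y1, x2, y2]`; that schema, the field names, and `parse_boxes` are assumptions, not the documented Cosmos Reason 2 format.

```python
import json

def parse_boxes(response_text, width, height):
    """Convert normalized [x1, y1, x2, y2] boxes to integer pixel boxes.

    The JSON schema is an assumption made for this example.
    """
    boxes = []
    for item in json.loads(response_text):
        x1, y1, x2, y2 = item["box_2d"]
        # Skip boxes that are inverted or fall outside the unit square.
        if not (0.0 <= x1 < x2 <= 1.0 and 0.0 <= y1 < y2 <= 1.0):
            continue
        boxes.append({
            "label": item["label"],
            "pixels": (round(x1 * width), round(y1 * height),
                       round(x2 * width), round(y2 * height)),
        })
    return boxes

# Hypothetical response: one valid box and one malformed (inverted) box.
sample = (
    '[{"label": "pallet", "box_2d": [0.25, 0.5, 0.75, 1.0]},'
    ' {"label": "bad", "box_2d": [0.9, 0.1, 0.2, 0.3]}]'
)
boxes = parse_boxes(sample, 1280, 720)
print(boxes)  # [{'label': 'pallet', 'pixels': (320, 360, 960, 720)}]
```

Validating coordinates before use is a deliberate choice here: generated text can contain malformed boxes, and dropping them early keeps downstream drawing or evaluation code simple.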

Developer Resources and Integration

The NVIDIA Cosmos Cookbook offers step-by-step workflows and technical recipes for building, adapting, and deploying these world foundation models. The models integrate with NVIDIA Omniverse (built on OpenUSD), which is used to create and simulate the 3D environments that serve as ground truth inputs.