AI2 launches MolmoSpaces, an open simulation platform for embodied AI with 230,000 scenes and 42 million grasps
· release · open-source · platform · feature · allenai.org ↗

A Unified Platform for Embodied AI Research

AI2 has launched MolmoSpaces, a comprehensive open ecosystem designed to accelerate embodied AI research by providing the data, assets, and benchmarking infrastructure that the field has lacked. The platform unifies over 230,000 indoor scenes and more than 130,000 object models—curated from Objaverse and AI2's THOR environment—with over 42 million annotated robotic grasps. All assets are provided in standard formats (MJCF and USD) and are compatible with popular simulators including MuJoCo, ManiSkill, and NVIDIA Isaac Lab/Sim.
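
Because the assets ship as standard MJCF, a scene can be loaded directly into MuJoCo's Python bindings. The snippet below is a minimal sketch; the file path is illustrative and the actual directory layout of the release may differ.

```python
import mujoco

# Load a MolmoSpaces scene exported as MJCF (path is illustrative, not an
# actual file shipped with the release).
model = mujoco.MjModel.from_xml_path("scenes/kitchen_0001/scene.xml")
data = mujoco.MjData(model)

# Step the simulation for one second of simulated time.
steps = int(1.0 / model.opt.timestep)
for _ in range(steps):
    mujoco.mj_step(model, data)

print(f"Loaded {model.nbody} bodies and {model.ngeom} geoms")
```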

High-Fidelity Physics and Realistic Simulation

A key differentiator of MolmoSpaces is its commitment to physics accuracy. Unlike earlier environments that used simplified physics and "magic grasps," MolmoSpaces leverages physics engines with carefully validated physical parameters. For rigid objects, mass and density are verified through LLM-assisted annotation and adjusted iteratively. Articulated objects undergo teleoperation-based tuning to ensure realistic joint behavior, and the simulation itself is validated against real robotic trajectories. Colliders and collision meshes are manually prepared for stable contact-rich simulation, with convex decomposition used where high-fidelity contact is needed.
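
To get a feel for what validated physical parameters mean in practice, a MuJoCo model's per-body mass and inertia can be inspected directly, and convex decomposition shows up as multiple mesh collision geoms attached to a body. This is a small sketch for inspection only; the asset path is illustrative.

```python
import mujoco

# Illustrative asset path, not an actual file from the release.
model = mujoco.MjModel.from_xml_path("assets/mug_0042/object.xml")

# Report per-body mass and diagonal inertia, the quantities the release says
# were checked via LLM-assisted annotation and iterative adjustment.
for i in range(model.nbody):
    name = mujoco.mj_id2name(model, mujoco.mjtObj.mjOBJ_BODY, i)
    print(f"{name}: mass={model.body_mass[i]:.4f} kg, inertia={model.body_inertia[i]}")

# Count mesh collision geoms; convex decomposition typically appears as
# several mesh geoms attached to a single body.
mesh_geoms = sum(
    1 for g in range(model.ngeom)
    if model.geom_type[g] == mujoco.mjtGeom.mjGEOM_MESH
)
print(f"{mesh_geoms} mesh collision geoms")
```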

Systematic Benchmarking for Generalization

MolmoSpaces includes MolmoSpaces-Bench, a comprehensive benchmark that enables researchers to measure policy generalization systematically across multiple axes: object properties (shape, size, weight, articulation), layouts (multi-room, multi-floor, clutter), task complexity, sensory conditions (lighting, viewpoints), dynamics, and task semantics. This controlled-variation approach lets researchers probe robustness to specific factors—such as grasp performance across object masses or policy resilience to lighting changes—rather than reporting a single aggregate success rate. The benchmark also supports sim-to-real validation through real-world testing.
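
The controlled-variation idea amounts to sweeping one factor while holding everything else fixed, then reporting success per factor value instead of a single aggregate. The sketch below is hypothetical: `success_by_factor`, `make_episodes`, and `run_episode` are placeholder names, not the benchmark's actual API.

```python
from typing import Callable, Iterable

def success_by_factor(
    policy,
    factor: str,
    values: Iterable,
    make_episodes: Callable,   # (factor, value, n) -> iterable of episode configs
    run_episode: Callable,     # (policy, episode) -> bool success
    episodes_per_value: int = 50,
) -> dict:
    """Hold all conditions fixed except one factor; report success per value."""
    rates = {}
    for value in values:
        outcomes = [
            run_episode(policy, ep)
            for ep in make_episodes(factor, value, episodes_per_value)
        ]
        rates[value] = sum(outcomes) / len(outcomes)
    return rates

# Example (placeholder call): probe grasp success across object mass
# success_by_factor(policy, "object_mass_kg", [0.1, 0.5, 1.0, 2.0], make_eps, rollout)
```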

Scale and Modularity

The platform's assets span both hand-curated THOR objects and carefully filtered Objaverse assets (129,000 objects across ~3,000 WordNet synsets). Scenes come from multiple sources—iTHOR-120, ProcTHOR-10K, ProcTHOR-Objaverse, and Holodeck—representing homes, offices, classrooms, hospitals, and museums. All 42+ million grasps are 6-DoF poses sampled directly from realistic object geometry, tested for diversity and robustness through perturbation and actuation analysis. An accompanying trajectory-generation pipeline enables reproducible demonstrations at scale.
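
A perturbation analysis of 6-DoF grasps might look like the sketch below: jitter each pose slightly in position and orientation and record how often it still succeeds. The grasp representation (position plus quaternion) and the `grasp_succeeds` checker are assumptions made for illustration, not the released data format or pipeline.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def perturbation_robustness(grasps, grasp_succeeds, n_perturb=10,
                            pos_sigma=0.005, rot_sigma_deg=3.0, seed=0):
    """Score each 6-DoF grasp by how often small pose perturbations still succeed.

    grasps: iterable of (pos, quat) pairs, pos (3,) in meters, quat (4,) in
            (x, y, z, w) order -- an assumed layout for this sketch.
    grasp_succeeds: callable (pos, quat) -> bool, e.g. a simulator rollout.
    """
    rng = np.random.default_rng(seed)
    scores = []
    for pos, quat in grasps:
        base_rot = R.from_quat(quat)
        successes = 0
        for _ in range(n_perturb):
            dpos = rng.normal(scale=pos_sigma, size=3)
            drot = R.from_rotvec(rng.normal(scale=np.deg2rad(rot_sigma_deg), size=3))
            if grasp_succeeds(pos + dpos, (drot * base_rot).as_quat()):
                successes += 1
        scores.append(successes / n_perturb)
    return np.asarray(scores)
```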

Getting Started

MolmoSpaces is fully open-source and modular, with code on GitHub, datasets on Hugging Face, a technical report, and a live demo. Researchers can inspect and modify MJCF files, regenerate grasps, swap robots and controllers, and use the platform across multiple simulators—providing the foundation for large-scale embodied AI research.
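
For the Hugging Face datasets, a subset of assets can be pulled with `huggingface_hub`. The repository id and file patterns below are placeholders; consult the release for the actual dataset name and layout.

```python
from huggingface_hub import snapshot_download

# Placeholder repo id and patterns -- not the actual MolmoSpaces dataset name.
local_dir = snapshot_download(
    repo_id="allenai/molmospaces",
    repo_type="dataset",
    allow_patterns=["scenes/*", "grasps/*"],  # pull a subset instead of everything
)
print("Assets downloaded to", local_dir)
```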