Allen Institute for AI

Allen Institute for AI — Open Source / Research

https://allenai.org

Changelogs

Olmix framework enables efficient data mixing for language model development

AI2 releases Olmix, an open-source framework for optimizing how training data from multiple sources (web text, code, PDFs, math) is combined during language model development. The framework provides empirically grounded configuration defaults and mixture reuse techniques that reduce compute costs while achieving 12% better downstream performance.
Tags: open-source, feature, platform
Allen AI releases AutoDiscovery, an automated hypothesis generation tool now available in the AstaLabs platform

AutoDiscovery is an AI-powered system that autonomously generates and tests scientific hypotheses on structured datasets, surfacing novel research directions without requiring researchers to specify questions in advance. The tool uses Bayesian surprise and Monte Carlo Tree Search to prioritize high-value experiments, and is now available as an experimental feature in the Asta platform.
Tags: release, feature, platform, api
AI2 launches MolmoSpaces, an open simulation platform for embodied AI with 230,000 scenes and 42 million grasps

MolmoSpaces is a large-scale, open-source ecosystem for training and evaluating embodied AI systems, unifying over 230,000 indoor scenes, 130,000+ object models, and 42 million annotated robotic grasps. The platform features physics-grounded simulation, a systematic benchmark for measuring generalization across multiple axes, and compatibility with major simulators like MuJoCo, ManiSkill, and NVIDIA Isaac.
Tags: release, open-source, platform, feature