Allen AI releases MolmoWeb, an open-source web agent model; achieves competitive performance without proprietary training data
release · model · open-source · feature · allenai.org

What's New

MolmoWeb is an open-source visual web agent that navigates and completes tasks in a web browser by interpreting screenshots, much as a human user reads the screen. Unlike proprietary web agents, MolmoWeb was trained entirely on open data (synthetic trajectories from text-based agents plus human demonstrations), without distillation from closed-source systems.

The system operates in a simple loop: observe the current webpage screenshot, reason about the next action, and execute browser commands (clicking, typing, scrolling, navigating). The agent produces natural-language thoughts explaining its reasoning before each action, maintaining transparency throughout the interaction.
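The observe, reason, act loop can be sketched in a few lines of Python. This is only an illustration of the control flow, not MolmoWeb's actual code: `Action`, `FakeBrowser`, `scripted_policy`, and `run_agent` are hypothetical names, the browser is a stub that performs no real navigation, and the scripted policy stands in for the model by replaying a fixed plan.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Action:
    """One browser command; the kinds mirror those named in the text."""
    kind: str            # "click", "type", "scroll", "goto", or "done"
    argument: str = ""   # e.g. text to type or a URL (format is an assumption)


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser, used only for illustration."""
    url: str = "about:blank"
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real browser would return image bytes; here we just encode the URL.
        return self.url.encode()

    def execute(self, action: Action) -> None:
        # Only "goto" changes state in this stub; every action is logged.
        if action.kind == "goto":
            self.url = action.argument
        self.log.append(f"{action.kind}:{action.argument}")


def scripted_policy(task: str, screenshot: bytes,
                    history: List[Tuple[str, Action]]) -> Tuple[str, Action]:
    """Placeholder for the model: replays a fixed plan instead of reasoning."""
    plan = [
        ("I should open the site first.", Action("goto", "https://example.com")),
        ("The search box is visible; I will type the query.", Action("type", task)),
        ("The results answer the task.", Action("done")),
    ]
    return plan[min(len(history), len(plan) - 1)]


def run_agent(task, browser, policy, max_steps=10):
    """Observe, reason, act; stop on 'done' or after max_steps."""
    history = []
    for _ in range(max_steps):
        screenshot = browser.screenshot()                     # observe
        thought, action = policy(task, screenshot, history)   # reason
        history.append((thought, action))                     # keep the thought for transparency
        if action.kind == "done":
            break
        browser.execute(action)                               # act
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    for thought, action in run_agent("open-source web agents", browser, scripted_policy):
        print(f"{thought} -> {action.kind} {action.argument}".rstrip())
```

Recording each thought alongside its action is what makes a run inspectable after the fact, and the `max_steps` cap is a common safeguard against an agent that never decides it is finished.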

Key Components

Allen AI is releasing a complete stack for web agent development:

  • Models: Two sizes (4B and 8B parameters) optimized for self-hosted deployment
  • Training Data (MolmoWebMix): A large open dataset combining:
    • 30K human task trajectories (590K+ subtask demonstrations) from crowdworkers across 1.1K+ websites
    • Synthetic trajectories generated from accessibility-tree agents for scale
    • 2.2M+ screenshot question-answer pairs for GUI perception
  • Infrastructure: Training code, evaluation pipelines, and data collection tools
  • Evaluation: Benchmarks on WebVoyager, Online-Mind2Web, DeepShop, and WebTailBench

Why This Matters

Previous web agents relied on undisclosed proprietary data and methods, limiting reproducibility and community research. By releasing models, training data, and infrastructure together, MolmoWeb establishes an open foundation for browser automation research, analogous to what Olmo did for language models. The visual-first approach (operating on screenshots rather than HTML) makes agents easier to debug and lets them adapt to any website without custom APIs.

Getting Started

All materials are available on Hugging Face and GitHub, including model weights, datasets, the technical report, and code. A live demo is available at molmoweb.allen.ai.