IBM releases Granite 4.0 1B Speech: 50% smaller model with higher accuracy and multilingual support

Granite 4.0 1B Speech Release

IBM has released Granite 4.0 1B Speech, a lightweight speech-language model optimized for automatic speech recognition (ASR) and bidirectional automatic speech translation (AST) on resource-constrained edge devices.

Key Improvements

The new model improves on its predecessor in size, accuracy, speed, and language coverage:

50% parameter reduction: Granite 4.0 1B Speech contains only 1 billion parameters, down from the 2 billion in granite-speech-3.3-2b
Higher accuracy: Despite the smaller footprint, the model delivers improved English transcription accuracy
Faster inference: Speculative decoding enables quicker processing on edge devices
Expanded language support: Now covers English, French, German, Spanish, Portuguese, and Japanese
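
The article does not describe how Granite 4.0 1B Speech implements speculative decoding, so the following is only a generic, minimal sketch of the greedy draft-and-verify idea. The `target` and `draft` callables, the `speculative_decode` function, and the parameter `k` are all hypothetical names introduced for illustration; they stand in for a large model and a cheaper draft model and are not part of any IBM API.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to the next token.
# Both callables below are toy stand-ins, not real speech models.
NextToken = Callable[[List[int]], int]


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,
) -> List[int]:
    """Greedy speculative decoding: the draft proposes, the target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # Step 1: the cheap draft model guesses up to k tokens ahead.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # Step 2: the target model checks each guess in order. A real system
        # would score all k positions in one batched forward pass, which is
        # where the speedup comes from; this toy loop calls it one at a time.
        accepted: List[int] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            accepted.append(expected)  # always keep the target's own token
            if expected != guess:
                break  # first mismatch: discard the rest of the proposal
        tokens.extend(accepted)
    return tokens


def greedy_decode(target: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Plain one-token-at-a-time decoding, used only as a reference."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target(tokens))
    return tokens
```

Because every emitted token is the one the target model would have chosen, this greedy variant produces the same output as ordinary decoding; the draft model only affects how many tokens are confirmed per verification round.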

New Capabilities

This release introduces two features frequently requested by the community:

Japanese ASR support: Full automatic speech recognition in Japanese
Keyword list biasing: Improved recognition of names and acronyms through custom keyword hints

Performance and Availability

Granite 4.0 1B Speech has taken the top spot on the OpenASR leaderboard, which ranks models by Word Error Rate (WER) on standard English ASR benchmarks. The model is released under the Apache 2.0 license, with native support in Hugging Face Transformers and vLLM.
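
For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration under simple assumptions (whitespace tokenization, no text normalization); leaderboards typically normalize casing and punctuation first, and the function name `word_error_rate` is our own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Lower is better: a WER of 0 means the transcript matches the reference exactly, and values above 1 are possible when the hypothesis contains many inserted words.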

For production deployments, IBM recommends pairing this model with Granite Guardian for additional risk detection capabilities. Full evaluation results, architecture details, training data, and usage examples are available on the model card.
