New Audio Transcription Capability
Cohere has announced the release of Cohere Transcribe, its first dedicated automatic speech recognition (ASR) model. This new offering extends Cohere's platform beyond text-based AI into audio processing, enabling developers to build speech-to-text functionality directly into their applications.
Supported Languages and Technical Specifications
The model supports 14 languages covering major global markets:
- European languages: English, German, French, Italian, Spanish, Portuguese, Greek, Dutch, Polish
- Asian languages: Vietnamese, Chinese, Japanese, Korean
- Middle Eastern languages: Arabic
The model is available under the identifier cohere-transcribe-03-2026. It accepts audio waveforms as input and returns clean transcribed text. The implementation is open source under the Apache 2.0 license.
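Because support is limited to the 14 languages listed above, an application may want to validate a requested language before sending audio. The sketch below is one way to do that. The announcement only shows the code "en", so the mapping to two-letter ISO 639-1 codes for the other languages is an assumption, and the names SUPPORTED_LANGUAGES and language_code are hypothetical helpers, not part of Cohere's SDK.

```python
# Assumed mapping from the 14 listed languages to ISO 639-1 codes.
# Only "en" appears in the announcement; every other code is an assumption.
SUPPORTED_LANGUAGES = {
    "English": "en", "German": "de", "French": "fr", "Italian": "it",
    "Spanish": "es", "Portuguese": "pt", "Greek": "el", "Dutch": "nl",
    "Polish": "pl", "Vietnamese": "vi", "Chinese": "zh", "Japanese": "ja",
    "Korean": "ko", "Arabic": "ar",
}


def language_code(name: str) -> str:
    """Return the assumed code for a listed language, or raise ValueError."""
    key = name.strip().title()
    if key not in SUPPORTED_LANGUAGES:
        raise ValueError(f"{name!r} is not one of the 14 listed languages")
    return SUPPORTED_LANGUAGES[key]
```

Failing early with a ValueError keeps an unsupported request from consuming rate-limited API calls.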
API Integration and Deployment Options
Developers can immediately begin using the model through Cohere's Audio Transcriptions API endpoint. The Python SDK provides straightforward integration:
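import cohere

co = cohere.ClientV2()

# Open the audio in binary mode; the with-block closes the file after the request.
with open("./sample.wav", "rb") as audio_file:
    response = co.audio.transcriptions.create(
        model="cohere-transcribe-03-2026",
        language="en",
        file=audio_file,
    )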
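The same call can be wrapped to process several recordings in one pass. The following is a minimal sketch, assuming the co.audio.transcriptions.create call shown above. The function name transcribe_files is hypothetical, and because the announcement does not describe the shape of the response object, the sketch returns each response as-is rather than guessing at its fields.

```python
from pathlib import Path

MODEL_ID = "cohere-transcribe-03-2026"


def transcribe_files(client, paths, language="en"):
    """Transcribe each audio file and return {path: raw API response}.

    `client` is expected to expose audio.transcriptions.create, as in the
    snippet above. Response fields are not assumed here.
    """
    results = {}
    for path in map(Path, paths):
        # Binary mode; the file is closed as soon as the request returns.
        with path.open("rb") as audio_file:
            results[str(path)] = client.audio.transcriptions.create(
                model=MODEL_ID,
                language=language,
                file=audio_file,
            )
    return results
```

Passing the client in as an argument, rather than constructing it inside the function, makes the helper easy to exercise with a stub object before any real audio is sent.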
Availability and Pricing
The model is available for free experimentation through Cohere's standard API, subject to rate limits. For production workloads that need higher throughput and lower latency, Cohere offers Model Vault, a managed, private cloud inference option. Model Vault pricing is based on hourly instance usage, with discounted rates for longer-term commitments. Enterprise customers should contact Cohere's sales team to discuss custom requirements.
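Since the free tier is rate limited, client code may need to retry requests that are rejected. The sketch below shows generic exponential backoff. The announcement does not say which exception the SDK raises when a limit is hit, so the exception types are passed in through the retry_on parameter; call_with_backoff and its parameters are hypothetical names chosen for illustration.

```python
import time


def call_with_backoff(fn, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with doubling delays.

    retry_on should be narrowed to whatever error the SDK raises for
    rate limiting; the default of Exception is only a placeholder.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

A transcription request could then be wrapped as call_with_backoff(lambda: ...), with the lambda body making the API call.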