Voxtral transcribes at the speed of sound
EXECUTIVE SUMMARY
Voxtral Transcribe 2: Revolutionizing Audio Transcription with Speed and Accuracy
Summary
Mistral has launched Voxtral Transcribe 2, a new family of audio-to-text transcription models that includes an open-weights version. It is the successor to the original Voxtral, which was released in July 2025.
Key Points
- Release Date: Voxtral Transcribe 2 was released recently, following the original Voxtral from July 2025.
- Models: The family includes Voxtral Realtime (open weights) and a closed-weights model named voxtral-mini-latest.
- Open Weights Model: Voxtral Realtime (Voxtral-Mini-4B-Realtime-2602) is available from Hugging Face as an 8.87 GB download under the Apache-2.0 license.
- Demo: A live demo lets users try the transcription directly; it copes even with complex jargon.
- API Access: The closed-weights model is accessed through the Mistral API, for example with curl commands that submit audio for transcription.
- Pricing: The API pricing is set at $0.003 per minute, equating to $0.18 per hour.
- Features: The Mistral API console includes a speech-to-text playground for testing, offering diarized transcripts and various download formats (text, SRT, JSON).
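The per-minute rate in the pricing point can be turned into a quick cost estimate for a given recording. The following is a minimal sketch, not Mistral's billing logic: the function name `estimate_cost_usd` is hypothetical, and it assumes billing is strictly linear in audio duration with no rounding, minimum charge, or volume discount.

```python
from decimal import Decimal

# Rate quoted in the summary: $0.003 per minute of audio.
PRICE_PER_MINUTE_USD = Decimal("0.003")


def estimate_cost_usd(duration_seconds: float) -> Decimal:
    """Estimate the API cost of transcribing one recording.

    Assumption (not confirmed by the source): cost scales linearly
    with duration, with no rounding or minimum charge.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Go through str() so float inputs do not leak binary rounding noise.
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    return minutes * PRICE_PER_MINUTE_USD


if __name__ == "__main__":
    # One hour of audio: 60 minutes x $0.003 = $0.18, matching the summary.
    print(estimate_cost_usd(3600))
    # A 90-second clip: 1.5 minutes x $0.003.
    print(estimate_cost_usd(90))
```

Using `Decimal` rather than floats keeps small currency amounts exact, which matters when many short clips are summed.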
Analysis
The introduction of Voxtral Transcribe 2 marks a significant advancement in AI-driven transcription technology, enhancing the speed and accuracy of audio-to-text conversion. This is particularly relevant for industries relying on real-time transcription services, such as media, education, and customer support.
Conclusion
IT professionals should explore the capabilities of Voxtral Transcribe 2 for applications in their organizations, particularly in automating transcription tasks and improving accessibility in communication.