AI/PROMPT ENGINEERING

Voxtral transcribes at the speed of sound

Source: Simon Willison

February 4, 2026

2 min read

EXECUTIVE SUMMARY

Voxtral Transcribe 2: Revolutionizing Audio Transcription with Speed and Accuracy

Summary

Mistral has launched Voxtral Transcribe 2, a new family of audio-to-text transcription models that includes an open-weights version. It is the successor to the original Voxtral, released in July 2025.

Key Points

Release Date: Voxtral Transcribe 2 follows the original Voxtral from July 2025; the source post covering the launch is dated February 4, 2026.
Models: The family includes Voxtral Realtime (open weights) and a closed-weights model named voxtral-mini-latest.
Open Weights Model: Voxtral Realtime (Voxtral-Mini-4B-Realtime-2602) is available from Hugging Face as an 8.87GB download under the Apache-2.0 license.
Demo: A live demo lets users try the transcription themselves, including on speech that contains complex jargon.
API Access: The closed-weights model is accessed through the Mistral API; the source post gives example curl commands for transcription.
Pricing: API pricing is $0.003 per minute of audio, which works out to $0.18 per hour.
Features: The Mistral API console includes a speech-to-text playground for testing, offering diarized transcripts and various download formats (text, SRT, JSON).
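
To make the pricing figure concrete, here is a minimal sketch that scales the quoted $0.003 per minute rate to longer recordings. It assumes billing is strictly linear in audio duration, with no minimum charge or rounding, which the summary does not confirm; the function name `estimate_cost` is purely illustrative.

```python
from decimal import Decimal

# Rate quoted in the summary: $0.003 per minute of audio.
PRICE_PER_MINUTE = Decimal("0.003")


def estimate_cost(minutes):
    """Return the estimated cost in dollars for `minutes` of audio.

    Assumes linear per-minute billing (an assumption, not confirmed
    by the source). Decimal avoids binary floating-point drift.
    """
    return PRICE_PER_MINUTE * Decimal(str(minutes))


if __name__ == "__main__":
    for label, minutes in [("1 hour", 60), ("8 hours", 480), ("90-second clip", 1.5)]:
        print(f"{label}: ${estimate_cost(minutes)}")
```

Under that linear assumption, sixty minutes of audio comes to $0.18, matching the hourly figure above.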
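
Since the playground offers SRT as a download format, it may help to see how a diarized transcript maps onto SRT cues. The sketch below is not Mistral's implementation: the segment fields (`start`, `end`, `speaker`, `text`) are a hypothetical schema chosen for illustration, and the real API response may be shaped differently. Only the SRT cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render diarized segments as SRT text, prefixing each cue with its speaker."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['speaker']}: {seg['text']}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    # Hypothetical diarized segments; field names are assumptions.
    example = [
        {"start": 0.0, "end": 2.5, "speaker": "speaker_1", "text": "Welcome back."},
        {"start": 2.5, "end": 5.04, "speaker": "speaker_2", "text": "Thanks for having me."},
    ]
    print(segments_to_srt(example))
```

Putting the speaker label inside the cue text is one common convention; SRT itself has no dedicated speaker field.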

Analysis

Voxtral Transcribe 2 pairs an Apache-2.0 licensed real-time model with a hosted API priced at $0.18 per hour of audio. That combination is particularly relevant for industries that rely on real-time transcription, such as media, education, and customer support.

Conclusion

IT professionals should explore the capabilities of Voxtral Transcribe 2 for applications in their organizations, particularly in automating transcription tasks and improving accessibility in communication.