Whisper-large-v3-turbo is a high-performance automatic speech recognition (ASR) and speech translation model from OpenAI, obtained by pruning Whisper large-v3: the number of decoder layers is reduced from 32 to 4, yielding significantly faster inference with only minor loss in accuracy. Trained on over 5 million hours of multilingual data, it performs speech transcription, translation, and language identification across 99 languages, and supports advanced decoding strategies such as beam search, temperature fallback, and timestamp prediction. The model handles both long-form and real-time audio, with chunked or sequential long-form inference options, and optimizations such as torch.compile, Flash Attention 2, and SDPA can further improve throughput. Despite being smaller than large-v3, it retains strong robustness to accents, background noise, and zero-shot tasks, making it well suited to scalable, multilingual ASR workloads.
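
As a rough illustration, a chunked long-form transcription call through the Hugging Face transformers pipeline might look like the sketch below (the model ID openai/whisper-large-v3-turbo follows the usual Hub naming; the audio file name, chunk length, and batch size are placeholder assumptions, not prescribed values):

```python
import torch
from transformers import pipeline

# Use GPU and half precision when available; fall back to CPU otherwise.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch_dtype,
    device=device,
    # SDPA attention is one of the optimizations mentioned above;
    # "flash_attention_2" can be used instead if that kernel is installed.
    model_kwargs={"attn_implementation": "sdpa"},
)

# Chunked long-form inference: the audio is split into 30 s windows that
# are transcribed in batches and stitched back together.
result = asr(
    "audio.mp3",                # placeholder path to a local audio file
    chunk_length_s=30,
    batch_size=8,
    return_timestamps=True,     # segment-level timestamp prediction
    generate_kwargs={"task": "transcribe"},
)
print(result["text"])
```

The same pipeline performs speech translation with generate_kwargs={"task": "translate"}; omitting chunk_length_s falls back to sequential long-form decoding.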