Whisper

OpenAI

About Deepgram

Deploy accurate speech recognition at scale while continuously improving model performance by labeling data and training from a single console. We deliver state-of-the-art speech recognition and understanding at scale. We do it by providing cutting-edge model training and data-labeling alongside flexible deployment options. Our platform recognizes multiple languages, accents, and words, dynamically tuning to the needs of your business with every training session. The fastest, most accurate, most reliable, most scalable speech transcription, with understanding — rebuilt just for enterprise. We’ve reinvented ASR with 100% deep learning that allows companies to continuously improve accuracy. Stop waiting for the big tech players to improve their software and forcing your developers to manually boost accuracy with keywords in every API call. Start training your speech model and reaping the benefits in weeks, not months or years.

About Whisper

We’ve trained and are open-sourcing a neural net called Whisper that approaches human-level robustness and accuracy in English speech recognition. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise, and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing. The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder Transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder.
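The chunking step described above can be sketched in a few lines of Python. This is a minimal illustration, not Whisper's own inference code: the function name `split_into_chunks`, the 16 kHz sampling rate, and the zero-padding of the final chunk are assumptions made for the example, and the log-Mel spectrogram and encoder stages are left out.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000   # assumed sampling rate in Hz (not stated in the text above)
CHUNK_SECONDS = 30     # chunk length given in the description
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_chunks(
    samples: Sequence[float], chunk_samples: int = CHUNK_SAMPLES
) -> List[List[float]]:
    """Split a mono waveform into fixed-length chunks.

    The last chunk is padded with zeros (silence) so that every chunk
    has exactly `chunk_samples` samples. Padding is an assumption of
    this sketch.
    """
    chunks: List[List[float]] = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        # Pad a short final chunk up to the fixed length.
        chunk.extend([0.0] * (chunk_samples - len(chunk)))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # 70 seconds of constant-valued placeholder audio.
    waveform = [0.1] * (SAMPLE_RATE * 70)
    chunks = split_into_chunks(waveform)
    print(len(chunks), "chunks of", len(chunks[0]), "samples each")
```

Each fixed-length chunk would then be converted to a log-Mel spectrogram before being passed to the encoder; that conversion needs a signal-processing library and is outside the scope of this sketch.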

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

Companies looking for a speech-to-text (STT) API for real-time and batch transcription, on-premises or in the cloud.

Audience

Anyone looking for a tool that recognizes speech automatically and improves text transcription.

Support

Phone Support
24/7 Live Support
Online

Support

Phone Support
24/7 Live Support
Online

API

Offers API

API

Offers API

Pricing

$0
Free Version
Free Trial

Pricing

No information available.
Free Version
Free Trial

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

Training

Documentation
Webinars
Live Online
In Person

Training

Documentation
Webinars
Live Online
In Person

Company Information

Deepgram
Founded: 2015
United States
deepgram.com

Company Information

OpenAI
United States
openai.com/blog/whisper/

Alternatives

Transcribe (Wreally)
Azure AI Speech (Microsoft)

Categories

Speech Recognition Features

Audio Capture
Automatic Form Fill
Automatic Transcription
Call Analysis
Concatenated Speech
Continuous Speech
Customizable Macros
Multi-Languages
Specialty Vocabularies
Speech-to-Text Analysis
Variable Frequency
Voice Recognition

Transcription Features

AI / Machine Learning
Annotations
Audio/Video File Upload
Automatic Transcription
Collaboration Tools
File Sharing
For Manual Transcription
Full Text Search
Multi-Language Support
Natural Language Processing (NLP)
Playback Controls
Speech Recognition
Subtitles
Text Editor
Timecoding

Integrations

Bolna
MacWhisper
Unremot
Utterly Voice
Vocode
Astro
ContactSwing
Creovai
Docker
Fluents.ai
Genesys Cloud CX
Google Cloud Platform
Krater.ai
NVIDIA DRIVE
Nova-3
Pruna AI
ReByte
TurboScribe
Zo

Integrations

Bolna
MacWhisper
Unremot
Utterly Voice
Vocode
Astro
ContactSwing
Creovai
Docker
Fluents.ai
Genesys Cloud CX
Google Cloud Platform
Krater.ai
NVIDIA DRIVE
Nova-3
Pruna AI
ReByte
TurboScribe
Zo