DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the instructions in the usage docs. If you want to use the pre-trained English model for performing speech-to-text, you can download it (along with other important inference material) from the DeepSpeech releases page.

Features

  • Uses a model trained by machine learning techniques
  • Based on Baidu's Deep Speech research paper
  • Uses Google's TensorFlow to make the implementation easier
  • A pre-trained English model is available for use
  • Download important inference material from the DeepSpeech releases page
  • Run in real time on all devices

Project Samples

Project Activity

See All Activity >

License

Mozilla Public License 2.0 (MPL 2.0)

Follow DeepSpeech

DeepSpeech Web Site

Other Useful Business Software
Go from Code to Production URL in Seconds Icon
Go from Code to Production URL in Seconds

Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
Try it free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of DeepSpeech!