StreamSpeech is an “all-in-one” speech model that performs offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Introduced in an ACL 2024 paper, it targets streaming and low-latency scenarios in which intermediate results, final translations, and synthesized speech must be produced continuously while audio is still being received. The same underlying system handles eight tasks: offline automatic speech recognition (ASR), speech-to-text translation (S2TT), speech-to-speech translation (S2ST), and text-to-speech synthesis (TTS), plus the streaming or simultaneous counterpart of each. During simultaneous translation, StreamSpeech can optionally output intermediate ASR transcripts and text translations, giving users or downstream applications real-time visibility into what the system is hearing and how it is translating it.
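
The chunk-by-chunk behavior described above can be illustrated with a minimal Python sketch. It is not StreamSpeech code and does not use the project's API: the names `PartialResult`, `chunk_audio`, and `simultaneous_loop` are hypothetical, and the "model" is a stand-in that only labels each chunk. The sketch shows only the general shape of a simultaneous loop, in which an intermediate transcript and translation are emitted after every incoming audio chunk rather than once at the end.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class PartialResult:
    """One incremental update emitted while audio is still arriving (hypothetical type)."""
    samples_seen: int   # number of audio samples consumed so far
    transcript: str     # placeholder for the intermediate ASR text
    translation: str    # placeholder for the intermediate translation text


def chunk_audio(samples: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Split a waveform into fixed-size chunks, as a streaming source would deliver it."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def simultaneous_loop(chunks: Iterable[List[float]]) -> Iterator[PartialResult]:
    """Consume chunks one at a time and emit an update after each one.

    Assumption: the 'model' here is a stub that only counts chunks. A real
    system would run recognition and translation on the audio received so far.
    """
    seen = 0
    for index, chunk in enumerate(chunks, start=1):
        seen += len(chunk)
        yield PartialResult(
            samples_seen=seen,
            transcript=f"<asr after chunk {index}>",
            translation=f"<translation after chunk {index}>",
        )


if __name__ == "__main__":
    audio = [0.0] * 10  # ten samples of silence as dummy input
    for update in simultaneous_loop(chunk_audio(audio, chunk_size=4)):
        print(update.samples_seen, update.transcript, update.translation)
```

Because the loop is a generator, a caller can display or forward each update as soon as it is produced, which is the property that makes intermediate results useful in low-latency settings.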

Features

  • Unified model for ASR, speech translation, and TTS in both offline and streaming modes
  • Supports eight distinct tasks including simultaneous S2ST, S2TT, and real-time TTS
  • Outputs intermediate transcripts and translations for richer low-latency interaction
  • SimulEval integration and agent scripts for systematic streaming evaluation
  • Web GUI demo and project page with audio samples and visualizations
  • Reported in the accompanying paper to achieve state-of-the-art performance on offline and simultaneous speech-to-speech translation

Categories

Text to Speech

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Text to Speech Software

Registered

2025-11-28