Qwen3-ASR is an open-source series of ASR models
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
kaldi-asr/kaldi is the official location of the Kaldi project
Bailing is a voice dialogue robot similar to GPT-4o
StreamSpeech is a seamless model for offline speech recognition
Audio foundation model excelling in audio understanding
Real-time voice interactive digital human
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Speech-AI-Forge is a project developed around TTS generation models
Port of OpenAI's Whisper model in C/C++
Video translation and dubbing tool powered by LLMs
End-to-end speech processing toolkit
Repo of Qwen2-Audio chat & pretrained large audio language model
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
Scalable generative AI framework built for researchers and developers
Open-source framework for intelligent speech interaction
Fast and accurate automatic speech recognition (ASR) for edge devices
A library for audio and music analysis, feature extraction
Conversational voice AI agents
Speech Note Linux app. Note taking, reading and translating
Open source AI VTuber platform with voice chat and Live2D avatars
Framework for building AI-powered interactive digital humans and agents
HTML5 JavaScript audio recording in mp3, wav, ogg, webm, and amr formats
Open-source industrial-grade ASR models