Showing 117 open source projects for "voice recognition"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Simple, Secure Domain Registration Icon
    Simple, Secure Domain Registration

    Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

    Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
    Sign up for free
  • 1
    TTS Voice Wizard

    TTS Voice Wizard

    Speech to Text to Speech, sends text as OSC messages

    Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) Use TTS Voice Wizard's accessibility features to improve your VRChat experience (it works outside of VRChat too!) You can convert your Speech-to-Text and back to Speech through various Speech Recognition and Text-to-Speech methods. You can send what you say as OSC messages to VRChat to be displayed on your avatar using KillFrenzyAvatarText or VRChats...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    GLM-4-Voice

    GLM-4-Voice

    GLM-4-Voice | End-to-End Chinese-English Conversational Model

    GLM-4-Voice is an open-source speech-enabled model from ZhipuAI, extending the GLM-4 family into the audio domain. It integrates advanced voice recognition and generation with the multimodal reasoning capabilities of GLM-4, enabling smooth natural interaction via spoken input and output. The model supports real-time speech-to-text transcription, spoken dialogue understanding, and text-to-speech synthesis, making it suitable for conversational AI, virtual assistants, and accessibility...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    XiaoZhi AI Chatbot

    XiaoZhi AI Chatbot

    Build your own AI friend

    xiaozhi-esp32 is an open-source project that guides users in building their own AI-powered conversational companion using the ESP32 microcontroller. The project provides detailed instructions on assembling the hardware, setting up the software, and integrating AI models to enable natural language interactions. This DIY approach offers an accessible entry point into AI and hardware development.
    Downloads: 273 This Week
    Last Update:
    See Project
  • 4
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 69 This Week
    Last Update:
    See Project
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 5
    sherpa-onnx

    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 37 This Week
    Last Update:
    See Project
  • 6
    Alan AI

    Alan AI

    In-App assistant SDK to build a multimodal conversational UX websites

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice enable your app, you only need to get the Alan Client SDK and drop it to your app. No need to plan for, deploy and maintain any infrastructure or speech components - the Alan Platform does the bulk of the work.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 7
    Parlant

    Parlant

    The behavior guidance framework for customer-facing LLM agents

    Parlant is a lightweight speech-to-text and text-to-speech framework designed for real-time AI-driven voice applications.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 9
    Qwen2-Audio

    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 10
    Handy STT

    Handy STT

    A free, open source, and extensible speech-to-text application

    ... field. Its backend leverages OpenAI’s Whisper models for GPU-accelerated speech recognition and Parakeet V3 for efficient CPU-only transcription with automatic language detection. To further refine accuracy and responsiveness, Handy integrates Silero’s Voice Activity Detection (VAD) for silence filtering, ensuring only speech segments are processed.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    Leku

    Leku

    Map location picker component for Android

    Map location picker component for Android. Based on Google Maps. An alternative to Google Place Picker. Component library for Android that uses Google Maps and returns a latitude, longitude and an address based on the location picked with the Activity provided. Note that you have the voice_search_extra_language that is used for the language of the voice recognition. Replace it with the allowed voice recognition locale for your language. We encourage you to add these languages to this component...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    TEN Framework

    TEN Framework

    TEN, a voice agent framework to create conversational AI.

    TEN (Transformative Extensions Network) is a voice agent framework for creating conversational AI applications, focusing on high performance and modularity.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Bolna

    Bolna

    Conversational voice AI agents

    Bolna is an end-to-end open-source platform for building conversational voice AI agents, enabling developers to create voice-first conversational assistants efficiently.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Recorder

    Recorder

    HTML5 js recording mp3 wav ogg webm amr format

    ... of browser (including PWA, WebClip, any App) on low-version iOS (11.0-14.2) except Safari inside page). Provides multiple plug-in function support. Rich audio visualization, variable speed and pitch processing, speech recognition, audio stream playback, etc.; with powerful real-time processing support, it can be used in various web applications: from simple recording to complex real-time voice Recognition (ASR), and even audio-related games, are handled with ease.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    Alan AI for iOS

    Alan AI for iOS

    In-App assistant SDK to build a multimodal conversational UX for iOS

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. Voice enable your app, you only need to get the Alan Client SDK and drop it into your app. No need to plan for, deploy and maintain any infrastructure or speech components - the Alan Platform does the bulk of the work.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Alan AI for Android

    Alan AI for Android

    Assistant SDK to build a multimodal conversational UX for Android

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. Voice enable your app, you only need to get the Alan Client SDK and drop it into your app. No need to plan for, deploy and maintain any infrastructure or speech components - the Alan Platform does the bulk of the work.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Vocode

    Vocode

    Build voice-based LLM agents. Modular + open source

    Vocode is an open source library that makes it easy to build voice-based LLM apps. Using Vocode, you can build real-time streaming conversations with LLMs and deploy them to phone calls, Zoom meetings, and more. You can also build personal assistants or apps like voice-based chess. Vocode provides easy abstractions and integrations so that everything you need is in a single library.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    OpenaiBot

    OpenaiBot

    Refractoring ChatBot+LLM, Gpt-3.5-turbo, ChatGPT Bot/Voice Assistant

    If you don't have the instant messaging platform you need or you want to develop a new application, you are welcome to contribute to this repository. You can develop a new Controller by using Event.py. Compatibility with multiple LLMs and integration with GPT and third-party systems is handled by our llm-kira project on GitHub. It can accurately limit billing, with limits and ID binding. Supports asynchronous operations and can handle multiple requests simultaneously. Allows for private and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Mice MX OS speech to text Voice Control

    Mice MX OS speech to text Voice Control

    Mice speech to text with MX Cinnamon OS ISO

    Note about this image This image contains a system based on Linux MX, which was created to improve accessibility within the Linux environment. The distribution uses the Cinnamon desktop interface, which is configured to be operated using voice commands and outputs. The user interface and the control of your own devices and home automation systems can be customized and extended. The voice control program MiceStTM.py was developed to enable easy adaptation to other languages. However, only...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    ATC-pie

    ATC-pie

    Air traffic control tower and radar simulator (solo + multi-player)

    ATC-pie is an air traffic control simulation program. It features solo, multi-player and teacher-student sessions, rendering 3D views of airports through FlightGear. It is essentially designed for realism, and simulates real-life ATC tasks and equipment such as strip racks and sequence management, handovers to/from neighbouring controllers, flight plans, primary & secondary radars, RDF, CPDLC, ATIS recording...
    Leader badge
    Downloads: 134 This Week
    Last Update:
    See Project
  • 21
    Scribe

    Scribe

    Free, open-source, and offline speech-to-text & voice control app.

    > Scribe is a free and open-source desktop assistant that brings powerful speech-to-text and voice control capabilities directly to your PC. It allows you to dictate text into any application, create custom voice commands, launch programs, and automate your workflow with text replacements. > Designed with privacy as a top priority, Scribe works completely offline. Your voice data never leaves your computer. Powered by the Vosk engine, it supports multiple languages and provides high...
    Leader badge
    Downloads: 46 This Week
    Last Update:
    See Project
  • 22
    KoboldCpp

    KoboldCpp

    Run GGUF models easily with a UI or API. One File. Zero Install.

    KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds off llama.cpp and adds many additional powerful features.
    Downloads: 137 This Week
    Last Update:
    See Project
  • 23
    Maia
    MAIA (MyApp Intelligence Artificial) is designed to provide a foundation for building your own voice-controlled assistant with Python. It uses various libraries and modules for speech recognition, text-to-speech synthesis, and custom functionality.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    cerberuscms2

    cerberuscms2

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-Like model. It is a custom written Web Application Framework ( W.A.F. ) with a consistent and custom written Pre-Hyper-Text-Post-Processor Programming Code Framework ( P.C.F. ). This Web Application Software Project' aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 25
    Cerberus Content Management System 6

    Cerberus Content Management System 6

    Cerberus Content Management System

    ..., Voice and Video Communications Platform and Content Management System in the world. The latest project version is programmed with the Pre-Hyper-Text-Post-Processor P.D.O. // P.H.P. Data Objects Driver Programming Code Statements that enables it to work on any Database Management System Server: MySQL Database, Maria Database, MicrosoftSQL Database, MiniSQL Database and More.. Recommended Web Server OS and Private OS: FreeBSD, KDE Plasma Mobile Continuation From @CerberusCMS, @CerberusCMS5
    Downloads: 8 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.