Showing 1270 open source projects for ".mega-voice"

View related business solutions
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 1
    Voice

    Voice

    Minimalistic audiobook player

    This is my digital playground where I am learning. I'm integrating and validating new technologies and ideas here, playing around with new UI / UX components, and developing with the best coding standard I have come up with. At the same time, I want to provide an audiobook player which is really easy in use and a joy to work with.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Voice-Pro

    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is the best gradio WebUI for transcription, translation and text-to-speech. It can be easily installed with one click. Create a virtual environment using Miniconda, running completely separate from the Windows system (fully portable). Supports real-time transcription and translation, as well as batch mode.
    Downloads: 16 This Week
    Last Update:
    See Project
  • 3
    clone-voice

    clone-voice

    A sound cloning tool with a web interface, using your voice

    Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    Moonshine Voice

    Moonshine Voice

    Fast and accurate automatic speech recognition (ASR) for edge devices

    moonshine is an open-source automatic speech recognition toolkit optimized for fast and accurate transcription on edge devices and local environments. The project is designed to enable real-time voice applications such as live transcription, voice commands, and embedded speech interfaces without requiring heavy cloud infrastructure. Its architecture emphasizes low latency and flexible input handling, allowing audio streams of varying durations rather than relying on fixed processing windows. Moonshine supports multiple platforms including mobile, desktop, and embedded systems, and provides example projects to accelerate integration into real-world products. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    MEGA Android Client

    MEGA Android Client

    MEGA Android App

    A fully-featured client to access your Cloud Storage provided by MEGA.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    TTS Voice Wizard

    TTS Voice Wizard

    Speech to Text to Speech, sends text as OSC messages

    ...The app can translate your speech from one language to over 20 other support languages. There are 100+ different voices with various customization options so you can pick a voice that best suits you. Display the current song you are listening to on Spotify or via your browser. Display tracker and controller battery life in conjunction with XSOverlay. Use in conjunction with HRtoVRChat_OSC to enable you to display your heartrate in VRChat's Chatbox.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 7
    GLM-4-Voice

    GLM-4-Voice

    GLM-4-Voice | End-to-End Chinese-English Conversational Model

    GLM-4-Voice is an open-source speech-enabled model from ZhipuAI, extending the GLM-4 family into the audio domain. It integrates advanced voice recognition and generation with the multimodal reasoning capabilities of GLM-4, enabling smooth natural interaction via spoken input and output. The model supports real-time speech-to-text transcription, spoken dialogue understanding, and text-to-speech synthesis, making it suitable for conversational AI, virtual assistants, and accessibility applications. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Cave Story MD

    Cave Story MD

    A fan port of Cave Story for the Sega Mega Drive

    This is a rewrite/port of the popular freeware game Cave Story for Sega Mega Drive/Genesis. It should work on any console or emulator. The main story is "finished", only little things and bugfixes remain. A fan port of Cave Story for the Sega Mega Drive.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    Voicebox

    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the results using studio-oriented editing concepts. ...
    Downloads: 56 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    Real-Time Voice Cloning

    Real-Time Voice Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 11
    Applio

    Applio

    A simple, high-quality voice conversion tool focused on ease of use

    Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning.
    Downloads: 88 This Week
    Last Update:
    See Project
  • 12
    GPT-SoVITS

    GPT-SoVITS

    1 min voice data can also be used to train a good TTS model

    GPT‑SoVITS is a state-of-the-art voice conversion and TTS system that enables zero‑shot and few‑shot synthesis based on a short vocal sample (e.g., 5 seconds). It supports cross‑lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It's powered by VITS architecture enhanced for few‑sample adaptation and real‑time usability.
    Downloads: 36 This Week
    Last Update:
    See Project
  • 13
    OpenVoice

    OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model

    OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak naturally in others. ...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 14
    Alan AI

    Alan AI

    In-App assistant SDK to build a multimodal conversational UX websites

    Quickly add voice to your app with the Alan Platform. Create an in-app voice assistant to enable human-like conversations and provide a personalized voice experience for every user. Alan is a conversational voice AI platform that lets you create an intelligent voice assistant for your app. It offers all the necessary tools to design, embed, and host your voice solutions.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Telegram Desktop

    Telegram Desktop

    Telegram Desktop messaging app

    Telegram Desktop is the official C++/Qt-based cross-platform client for Telegram, implementing the full Telegram API and MTProto protocol for secure messaging, voice/video calls, file sharing, and chat features. It provides message sync across devices, supports themes, stickers, bots, and is actively maintained.
    Downloads: 798 This Week
    Last Update:
    See Project
  • 16
    Vocode

    Vocode

    Build voice-based LLM agents. Modular + open source

    Vocode is an open source library that makes it easy to build voice-based LLM apps. Using Vocode, you can build real-time streaming conversations with LLMs and deploy them to phone calls, Zoom meetings, and more. You can also build personal assistants or apps like voice-based chess. Vocode provides easy abstractions and integrations so that everything you need is in a single library.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Alan AI for Android

    Alan AI for Android

    Assistant SDK to build a multimodal conversational UX for Android

    Quickly add voice to your app with the Alan Platform. Create an in-app voice assistant to enable human-like conversations and provide a personalized voice experience for every user. Alan is a conversational voice AI platform that lets you create an intelligent voice assistant for your app. It offers all the necessary tools to design, embed, and host your voice solutions.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    Porcupine

    Porcupine

    On-device wake word detection powered by deep learning

    Build always-listening yet private voice applications. Porcupine is a highly-accurate and lightweight wake word engine. It enables building always-listening voice-enabled applications. It is using deep neural networks trained in real-world environments. Compact and computationally-efficient. It is perfect for IoT. Cross-platform. Arm Cortex-M, STM32, PSoC, Arduino, and i.MX RT.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 19
    Alan AI for iOS

    Alan AI for iOS

    In-App assistant SDK to build a multimodal conversational UX for iOS

    Quickly add voice to your app with the Alan Platform. Create an in-app voice assistant to enable human-like conversations and provide a personalized voice experience for every user. Alan is a conversational voice AI platform that lets you create an intelligent voice assistant for your app. It offers all the necessary tools to design, embed, and host your voice solutions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 21
    ElatoAI

    ElatoAI

    Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP

    ElatoAI is a real-time AI voice agent platform built around IoT hardware (ESP32) that enables continuous speech-to-speech conversations using state-of-the-art multimodal voice models with minimal latency and global performance via edge computing. The system integrates voice synthesis and recognition by connecting an ESP32 device through secure WebSockets to edge server functions written in Deno, allowing users to speak naturally with AI agents hosted through cloud APIs including OpenAI’s Realtime API, Gemini’s Live API, xAI’s Grok Voice Agent API, and others. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Mumble

    Mumble

    Mumble is an open-source, low-latency, high quality voice chat

    Mumble is an open-source, low-latency, high-quality voice chat software. There are two modules in Mumble; the client (mumble) and the server (murmur). The client works on Windows, Linux, FreeBSD, OpenBSD, and macOS, while the server should work on anything Qt can be installed on. Low-latency and high-quality voice-chat program written on top of Qt and Opus. Administrators appreciate Mumble for being able to self-host and have control over data security and privacy.
    Downloads: 23 This Week
    Last Update:
    See Project
  • 23
    Pipecat

    Pipecat

    Framework for building real-time voice and multimodal AI agents

    Pipecat is an open source Python framework designed for building real-time voice and multimodal conversational AI agents. It provides developers with tools to orchestrate complex pipelines that combine speech recognition, language models, audio processing, and speech synthesis into a cohesive conversational system. Pipecat focuses on low-latency interactions so voice conversations with AI feel natural and responsive during live use.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    Qwen3-TTS

    Qwen3-TTS

    Qwen3-TTS is an open-source series of TTS models

    ...Developers can customize voice output parameters like speed, pitch, and volume, and combine the TTS stack with other AI components.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 25
    VideoChat

    VideoChat

    Real-time voice interactive digital human

    VideoChat is a real-time voice-interactive “digital human” system that combines automatic speech recognition, large language models, text-to-speech, and talking-head generation into a single conversational pipeline. It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB