Voice Cloning Software for Windows

Browse free open source Voice Cloning software and projects for Windows below.

  • 1
    Lyrebird

    Simple and powerful voice changer for Linux, written with Python & GTK.
    Downloads: 56 This Week
  • 2
    GPT-SoVITS

    As little as 1 minute of voice data can be used to train a good TTS model

    GPT-SoVITS is a state-of-the-art voice conversion and TTS system that enables zero-shot and few-shot synthesis based on a short vocal sample (e.g., 5 seconds). It supports cross-lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It is built on a VITS architecture enhanced for few-sample adaptation and real-time usability.
    Downloads: 49 This Week
  • 3
    Coqui TTS

    A deep learning toolkit for Text-to-Speech, battle-tested in research

    TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it is designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. Features include high-performance deep learning models for Text2Speech tasks; Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech); a Speaker Encoder to compute speaker embeddings efficiently; vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN); fast and efficient model training; detailed training logs on the terminal and Tensorboard; support for multi-speaker TTS; an efficient, flexible, and lightweight but feature-complete Trainer API; released, ready-to-use models; tools to curate Text2Speech datasets under dataset_analysis; and utilities to use and test your models.
    Downloads: 27 This Week
  • 4
    OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model

    OpenVoice is a versatile instant voice cloning system that can replicate a speaker’s tone color from just a short audio clip and then generate speech in multiple languages. It is designed not only to match the timbre of the reference voice, but also to give granular control over style parameters such as emotion, accent, rhythm, pauses, and intonation. The model supports cross-lingual and even zero-shot cross-lingual voice cloning, so a speaker recorded in one language can be made to speak naturally in others. Architecturally, OpenVoice separates “tone color” cloning from style control, which makes it easier to keep a consistent identity while flexibly changing prosody or language. The project provides open-weight models, inference code, and examples, making it suitable both for research and for building production voice experiences. It is actively developed by MyShell, which also integrates OpenVoice into broader agent and entertainment workflows.
    Downloads: 26 This Week
  • 5
    Real-Time Voice Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that captures voice characteristics; this embedding is then used by a Tacotron-style synthesizer to generate spectrograms from text, which a WaveRNN-based vocoder finally turns into audio. The repo includes both a command-line demo and a graphical “toolbox” application where you can load reference voices, type text, and hear the synthesized results interactively. It also provides scripts for preprocessing datasets (such as LibriSpeech) and for training each of the three components.
    Downloads: 8 This Week
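
    The three-stage data flow described for Real-Time Voice Cloning can be sketched in Python. This is a toy illustration under loud assumptions, not the repository's code: `encode_speaker`, `synthesize_spectrogram`, `vocode`, and `clone_voice` are hypothetical names, and each stage is a trivial placeholder standing in for a trained neural network. It only shows how a fixed-size speaker embedding conditions the synthesizer and how the vocoder consumes the resulting frames.

```python
from typing import List


def encode_speaker(reference_audio: List[float], dim: int = 4) -> List[float]:
    """Stage 1 (placeholder): reduce a clip to a fixed-size 'speaker embedding'.

    A real encoder is a trained network; here we average the clip in `dim`
    chunks so the output length never depends on the input length.
    """
    chunk = max(1, len(reference_audio) // dim)
    return [
        sum(reference_audio[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(dim)
    ]


def synthesize_spectrogram(text: str, embedding: List[float]) -> List[List[float]]:
    """Stage 2 (placeholder): one frame per character, offset by the embedding."""
    return [[ord(ch) / 255.0 + e for e in embedding] for ch in text]


def vocode(spectrogram: List[List[float]], hop: int = 2) -> List[float]:
    """Stage 3 (placeholder): expand each frame into `hop` waveform samples."""
    waveform: List[float] = []
    for frame in spectrogram:
        level = sum(frame) / len(frame)
        waveform.extend([level] * hop)
    return waveform


def clone_voice(reference_audio: List[float], text: str) -> List[float]:
    """Chain the three stages: audio -> embedding -> spectrogram -> waveform."""
    embedding = encode_speaker(reference_audio)
    spectrogram = synthesize_spectrogram(text, embedding)
    return vocode(spectrogram)


audio = clone_voice([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8], "hi")
```

    The point of the split is that only the first stage ever sees the reference audio; the synthesizer and vocoder work from the embedding and the text alone.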
  • 6
    Parakeet

    PAddle PARAllel text-to-speech toolKIT

    Parakeet (PAddle PARAllel text-to-speech toolKIT) supports Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow, and Parallel WaveGAN. It aims to provide a flexible, efficient, and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models. To make it easier to use existing TTS models directly and to develop new ones, Parakeet selects typical models and provides reference implementations in PaddlePaddle. Furthermore, Parakeet abstracts the TTS pipeline and standardizes data preprocessing, common module sharing, model configuration, and the training and synthesis process. The supported models include a text front end, end-to-end acoustic models, and vocoders.
    Downloads: 4 This Week
  • 7
    elevenlabs-api

    elevenlabs-api is an open source Java wrapper around the ElevenLabs

    elevenlabs-api is an open-source Java wrapper around the ElevenLabs Voice Synthesis and Cloning Web API. Compiled JARs are available via the Releases tab. To find your ElevenLabs API key, go to the official website and open the 'Profile' tab, where your xi-API-key is shown. To set up the key, you must register it with the ElevenLabsAPI Java API. For the security of any public repository, store your API key in an environment variable or otherwise outside your source code. The most realistic and versatile AI speech software, ever. Eleven brings the most compelling, rich and lifelike voices to creators and publishers seeking the ultimate tools for storytelling. Generate top-quality spoken audio in any voice and style with the most advanced and multipurpose AI speech tool out there. Our deep learning model renders human intonation and inflections with unprecedented fidelity and adjusts delivery based on context.
    Downloads: 3 This Week
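
    The advice to keep the API key out of source code applies in any language. The wrapper itself is Java; the sketch below shows the same practice in Python purely for illustration. The variable name `ELEVENLABS_API_KEY` and the helper `load_api_key` are assumptions made for this example, not names defined by the wrapper or by ElevenLabs.

```python
import os


def load_api_key(var_name: str = "ELEVENLABS_API_KEY") -> str:
    """Read an API key from the environment instead of hard-coding it.

    `var_name` is a hypothetical variable name chosen for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message rather than sending an empty key.
        raise RuntimeError(f"Set the {var_name} environment variable before running.")
    return key
```

    Call `load_api_key()` once at startup and pass the result to whatever client you use, so the key never appears in the repository.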
  • 8
    PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model

    PaddleSpeech is an open-source toolkit on the PaddlePaddle platform for a variety of critical tasks in speech and audio, with state-of-the-art and influential models. Through an easy-to-use, efficient, flexible, and scalable implementation, our vision is to empower both industrial applications and academic research, covering training, inference and testing modules, and the deployment process. Installation has a low barrier, and a CLI, Server, and Streaming Server are available to get you started quickly. We provide high-speed, ultra-lightweight models as well as cutting-edge technology. We provide production-ready streaming ASR and streaming TTS systems. Our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt to Chinese context.
    Downloads: 2 This Week
  • 9
    Mocking Bird

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English. The codebase is implemented in Python (with PyTorch) and includes modules for encoder, synthesizer, vocoder, preprocessing, and inference, as well as demo scripts and a web-server interface for easier experimentation or deployment. MockingBird supports both using pretrained models and training your own synthesizer (with custom datasets), giving flexibility for voice-cloning or custom-voice synthesis depending on your needs.
    Downloads: 1 This Week
  • 10
    lora-svc

    Singing voice change based on whisper, lora for singing voice clone

    Singing voice conversion (SVC) based on Whisper, with LoRA for singing voice cloning. You will feel the beauty of the code from this project. The Uni-SVC main branch is for singing voice cloning based on Whisper with a speaker encoder and speaker adapter. Uni-SVC's main goal is to develop LoRA for SVC. With LoRA, cloning a singer may need only 10 sentences and about 10 minutes of training. Each singer is a plug-in for the base model.
    Downloads: 1 This Week
  • 11
    Multilingual Speech Synthesis

    An implementation of Tacotron 2 that supports multilingual experiments

    This repository provides synthesized samples, training and evaluation data, source code, and parameters for the paper "One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech". It contains an implementation of Tacotron 2 that supports multilingual experiments and that implements different approaches to encoder parameter sharing. It presents a model combining ideas from "Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning", "End-to-End Code-Switched TTS with Mix of Monolingual Recordings", and "Contextual Parameter Generation for Universal Neural Machine Translation". We provide data for comparison of three multilingual text-to-speech models. The first shares the whole encoder and uses an adversarial classifier to remove speaker-dependent information from the encoder. The second has separate encoders for each language.
    Downloads: 0 This Week
  • 12
    Voice Cloning App

    A Python/PyTorch app for easily synthesizing human voices

    A Python/PyTorch app for easily synthesizing human voices. If you are using a language other than English, you can add it to the app: first find a deep speech model for your language from Coqui, then download the model.pbmm and alphabet.txt files for that language. Requirements: Windows 10 or Ubuntu 20.04+, 5GB+ of disk space, and optionally an NVIDIA GPU with at least 4GB of memory and driver version 456.38+. Features include automatic dataset generation (with support for subtitles and audiobooks), additional language support, local and remote training, easy training start/stop, and data importing/exporting.
    Downloads: 0 This Week
  • 13
    VoiceOver

    VoiceOver is a web application that allows you to transcribe audio

    VoiceOver is a web application that allows you to transcribe English audio and listen to it in another voice. Choose a source: an audio file (.wav), in English only. Transcribe the audio; several algorithms take care of it. Listen to the generated transcription in a man's or a woman's voice; it's up to you!
    Downloads: 0 This Week
  • 14
    VoiceSmith

    [WIP] VoiceSmith makes training text to speech models easy

    VoiceSmith makes it possible to train and infer on both single and multispeaker models without any coding experience. It fine-tunes a pretty solid text-to-speech pipeline based on a modified version of DelightfulTTS and UnivNet on your dataset. Both models were pretrained on a proprietary 5000-speaker dataset. It also provides some tools for dataset preprocessing, such as automatic text normalization. It runs on Windows (only CPU supported currently) or any Linux-based operating system. If you want to run this on macOS, you have to follow the steps in "build from source" in order to create the installer. This is untested since I don't currently own a Mac. An NVIDIA GPU with CUDA support is highly recommended; you can train on CPU otherwise, but it will take days if not weeks. VoiceSmith currently uses a two-stage modified DelightfulTTS and UnivNet pipeline.
    Downloads: 0 This Week
  • 15
    vocoder_chung
    vocoder_chung is a small educational vocoder that uses a discrete Fourier transform (FFT) spectrum, written in easy, fast, compiled FreeBASIC. (24/12/2019) Uses the fast and accurate FFTdll.dll. (28/03/2020) Added an algorithmic voice cloning / change / morphing experiment.
    Downloads: 0 This Week
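
    To make the "discrete Fourier transform spectrum" idea concrete, here is a minimal Python sketch. It is not vocoder_chung's code (that project is written in FreeBASIC and relies on FFTdll.dll); it uses a naive O(n^2) DFT rather than a fast FFT, and the helper names `dft_magnitudes` and `dominant_bin` are made up for this example. It computes the magnitude spectrum of a short test tone and finds the strongest frequency bin, which is the kind of spectral information a vocoder analyzes and reshapes.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin for real samples (naive O(n^2) DFT)."""
    n = len(samples)
    mags = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            # Correlate the signal with a complex sinusoid at bin frequency k.
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def dominant_bin(samples):
    """Index of the strongest bin in the lower half of the spectrum, skipping DC."""
    mags = dft_magnitudes(samples)
    half = len(samples) // 2
    return max(range(1, half), key=lambda k: mags[k])


# A 64-sample test tone that completes exactly 5 cycles.
N = 64
tone = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
peak = dominant_bin(tone)
```

    Because the tone completes exactly 5 cycles in the window, `peak` lands on bin 5. A real vocoder would run this analysis on short overlapping frames and use an FFT for speed.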