Showing 29 open source projects for "more voices"

  • 1
    kokoro-onnx

    TTS with kokoro and onnx runtime

    ...The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps. It supports multiple languages and voices, with a curated voice list and configuration via a VOICES file hosted alongside the models. The package is distributed on PyPI, meaning you can integrate it directly into applications or scripts using standard Python tooling. It also recommends pairing with an external G2P package to improve pronunciation quality, especially for more complex languages or names, and is licensed under permissive MIT and Apache-style licenses.
    Downloads: 53 This Week
  • 2
    Speech Note

    Speech Note Linux app. Note taking, reading and translating

    Speech Note is a Linux desktop and Sailfish OS application for taking, reading, and translating notes with integrated offline speech technology. It combines speech-to-text, text-to-speech, and machine translation in a single interface, allowing users to dictate notes, listen back to them, and translate them without ever sending data to the cloud. All processing is done locally, which means audio, text, and translations never leave the device, emphasizing strong privacy guarantees. The...
    Downloads: 7 This Week
  • 3
    RHVoice

    Free open source speech synthesizer for Russian and other languages

    RHVoice is a free and open-source multilingual speech synthesizer. Its developers hope to give more visually impaired people the ability to use a good free synthesis voice reading in their native language with their screen reader. We are especially interested in supporting those languages for which there are currently no good voices that could be used with a screen reader. The creator of RHVoice, Olga Yakovleva, is blind herself.
    Downloads: 46 This Week
  • 4
    ChatTTS

    A generative speech model for daily dialogue

    ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.
    Downloads: 2 This Week
  • 5
    ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages

    ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of...
    Downloads: 31 This Week
  • 6
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken segment and synthesizes audio via neural TTS services, producing one audio clip per subtitle entry. ...
    Downloads: 0 This Week
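The timing step described for Auto Synced & Translated Dubs (deriving each segment's required duration from its subtitle timestamps) might be sketched as follows. This is a minimal illustration, not the project's own code: the function name `segment_durations` and the regular expression are assumptions, and it only handles standard SRT timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`.

```python
import re

# Matches one SRT timing line, e.g. "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to integer milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def segment_durations(srt_text):
    """Return the duration in seconds of every subtitle entry (hypothetical helper)."""
    durations = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        start_ms = _to_ms(*fields[:4])
        end_ms = _to_ms(*fields[4:])
        durations.append((end_ms - start_ms) / 1000)
    return durations


SAMPLE = """1
00:00:01,000 --> 00:00:04,500
Hello there.

2
00:00:05,000 --> 00:00:07,250
How are you?
"""

print(segment_durations(SAMPLE))
```

Each duration could then serve as the target length for the synthesized clip of that subtitle entry.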
  • 7
    peon-ping

    Warcraft III Peon voice notifications (+ more!) for Claude Code

    Peon-ping is a quirky utility that brings fun and practical voice notifications to your development workflow by using Warcraft III peon-style sound effects whenever significant events occur in your code editor or terminal. The project is built around the idea of reducing cognitive load by audibly alerting you when processes finish, tests fail, or language models complete responses, helping you stay focused without constantly watching the screen. It integrates with Claude Code, Codex, and...
    Downloads: 4 This Week
  • 8
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    ...It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all necessary dependencies, so users can focus on experimenting with voices instead of managing tooling. It offers both a Gradio backend and an optional React frontend, which can be accessed on separate ports and even run inside Docker for more reproducible deployments. An extension system lets you enable extra models and tools, install community extensions from a catalog, and manage them via a dedicated GUI or CLI extension manager.
    Downloads: 6 This Week
  • 9
    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    ...The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple languages and voicepacks and allows phoneme-based generation for more accurate pronunciation and prosody. The server also offers per-word timestamped captions, which makes it useful for creating subtitles or aligning audio with text. A built-in web UI, API documentation, and debug endpoints for monitoring system status help users explore voices, test requests, and integrate the service into larger systems.
    Downloads: 0 This Week
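To illustrate what "OpenAI-compatible speech endpoint" can mean in practice, here is a hedged sketch that only builds (and does not send) a request in the shape of the OpenAI audio speech API. The path `/v1/audio/speech` and the JSON fields `model`, `input`, `voice`, and `response_format` follow the OpenAI API; the base URL, port, model name `kokoro`, and voice name `af_bella` are assumptions for illustration and should be checked against the project's own documentation.

```python
import json
from urllib.request import Request


def build_speech_request(base_url, text, voice, model="kokoro", fmt="mp3"):
    """Build a POST request shaped like an OpenAI audio speech call (hypothetical helper).

    The default model name is an assumption, not a confirmed value.
    """
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }
    return Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local server address and voice name; adjust to your deployment.
request = build_speech_request("http://localhost:8880/", "Hello world", "af_bella")
print(request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and writing the response bytes to a file, assuming a server is actually listening at that address.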
  • 10
    Step-Audio

    Open-source framework for intelligent speech interaction

    ...The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and produces speech accordingly, enabling natural dialogue, voice cloning, and expressive speech synthesis. Through its architecture, Step-Audio supports multilingual interaction, dialects, emotional tones (joy, sadness, etc.), and even more creative speech styles (like rap or singing), while allowing dynamic control over speech characteristics. It also provides a “generative data engine,” which can produce synthetic speech data (cloning voices, varying style) to support TTS training.
    Downloads: 0 This Week
  • 11
    StyleTTS 2

    Towards Human-Level Text-to-Speech through Style Diffusion

    ...It extends the original StyleTTS idea by introducing a style diffusion model that can sample rich, realistic speaking styles conditioned on reference speech, allowing highly expressive and diverse prosody. The architecture uses a two-stage training process and leverages an auxiliary speech language model to guide generation toward more natural and coherent utterances. StyleTTS2 supports both single-speaker and multi-speaker configurations, with the ability to sample or transfer styles from reference audio, making it powerful for expressive TTS and character voices. The repository includes training scripts, configuration files, and pre-trained auxiliary modules such as a text aligner, pitch extractor, and PL-BERT-based linguistic encoder.
    Downloads: 3 This Week
  • 12
    SpeakFlow-TTS

    Multilingual Text-to-Speech (TTS)

    SpeakFlow is an intuitive desktop application for Text-to-Speech (TTS) conversion. It allows you to easily transform entered text into high-quality audio files, using natural voices in many languages. Key features of SpeakFlow: Multilingual support: Choose from a wide range of languages and voices (Ukrainian, English, German, Russian, Polish, French, Italian, Spanish, Portuguese, and more). Simple and intuitive interface: Designed for quick and convenient audio generation. Audio Playback: Instantly listen to the generated audio and download it in MP3 format.
    Downloads: 7 This Week
  • 13
    EmotiVoice

    Multi-Voice and Prompt-Controlled TTS Engine

    EmotiVoice is a multi-voice, prompt-controlled text-to-speech engine designed to generate highly expressive speech across thousands of voices. It supports both English and Chinese and ships with over 2,000 preset voices, making it suitable for everything from characters and virtual anchors to narration and dialogue. The core idea is prompt-based emotional and style control: you can ask the engine to speak “happy,” “sad,” “excited,” or with other high-level style prompts that shape prosody,...
    Downloads: 1 This Week
  • 14
    Bert-VITS2

    VITS2 backbone with multilingual-bert

    ...The repository includes everything needed to train, fine-tune, and run the model, from configuration files to preprocessing scripts, spectrogram utilities, and training entrypoints for multi-GPU and multi-node setups. It provides emotional modeling through “emo embeddings,” allowing voices to be conditioned on different affective states during synthesis. Releases include optimizations for Japanese and English alignment, expanded training data, spec caching and pre-generation tools, as well as ONNX export for more lightweight inference deployments.
    Downloads: 0 This Week
  • 15
    Pearl Desktop (PDE) 12

    The Stable Solid Multimedia Workhorse Powerful OS with Eye Candy

    Pearl Linux Desktop (PDE) 12 is based on Ubuntu 24.04 LTS. This is your go-to workhorse daily driver for advanced and new Linux users alike. We say YES to APT, Flatpak and AppImages but NO to Snaps. It features Firefox-ESR instead of Firefox and uses PulseAudio by default; install the package pearl-pipewire-config from our repo to make PipeWire your default sound server. Very smooth and easy configs. Compiz is the default window manager and you may switch window managers without...
    Downloads: 20 This Week
  • 16
    MuseScore

    Free music notation & composition software

    MuseScore is a free and open-source music notation software designed for composers, arrangers, educators, and musicians of all levels. It allows users to write, edit, and print professional-quality sheet music with no limitations or subscription fees. MuseScore Studio supports a wide range of instruments and ensembles, from solo piano and guitar to full orchestras and choirs. The software is easy to learn while still offering powerful tools for advanced notation and score layout. With...
    Downloads: 40 This Week
  • 17
    Coqui TTS

    A deep learning toolkit for Text-to-Speech, battle-tested in research

    TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...
    Downloads: 19 This Week
  • 18
    AWA-Core

    Full application for factories, process engineers and automation

    It's here, finally. AWA-Core 2026 has arrived with a totally new architecture. The core now uses a client/server architecture and is open to other applications (including yours), with new interfaces for the server and client sides. Please go to our YouTube channel to see many tutorials about this new release. Don't waste your time trying things and clicking everywhere. Wait for our tutorials, install AWA-Core on a single PC or in a complete C/S architecture (Servers provided) and run...
    Downloads: 2 This Week
  • 19
    MaryTTS

    An open-source, multilingual text-to-speech synthesis system

    ...It is now maintained by the Multimodal Speech Processing Group in the Cluster of Excellence MMCI and DFKI. As of version 5.2, MaryTTS supports German, British and American English, French, Italian, Luxembourgish, Russian, Swedish, Telugu, and Turkish; more languages are in preparation. MaryTTS comes with toolkits for quickly adding support for new languages and for building unit selection and HMM-based synthesis voices.
    Downloads: 10 This Week
  • 20
    Read Aloud

    An awesome browser extension that reads aloud webpage content

    Read Aloud is a browser extension for Chrome, Firefox, and other Chromium-based browsers that converts webpage text to audio using text-to-speech technology. It is designed to work on a wide variety of sites, including news, blogs, online textbooks, course materials, fanfiction, and more. The extension targets users who prefer listening over reading, as well as people with dyslexia, other learning disabilities, or eye strain, and children learning to read. Read Aloud lets users choose from multiple voices: built-in browser voices, plus premium cloud voices from providers such as Google Wavenet, Amazon Polly, IBM Watson, and Microsoft. ...
    Downloads: 0 This Week
  • 21
    TensorFlowTTS

    Real-Time State-of-the-art Speech Synthesis for Tensorflow 2

    TensorFlowTTS is a state-of-the-art, open-source speech synthesis library built on TensorFlow 2. It offers a variety of architectures for text-to-speech, including classic and modern models such as Tacotron‑2, FastSpeech / FastSpeech2, and neural vocoders like MelGAN and Multiband‑MelGAN. Because it’s based on TensorFlow 2, it can leverage optimizations such as fake-quantization aware training and pruning — which allow models to run faster than real time and to be deployable on mobile or...
    Downloads: 0 This Week
  • 22
    Secure File Vault

    A very secure file vault for private files to avoid hackers

    ...It uses a combination of VeraCrypt, WinRAR, and 7-Zip to encrypt your files so that they stay secure and cannot be seen by anyone else. When you create the file vault, your password is hashed a million times (this takes only 3 seconds), then Base64-encoded 3 times and hashed 1 more time; it uses a combination of SHA256, SHA512 and a lot of other secret algorithms to make it really secure. This program also has voices for people who have trouble seeing the screen. The purpose is to make this program accessible to everyone, because privacy is everyone's right and should be respected. SERIAL NUMBERS: Installer Password: password Trial Extension Key: diamond sword Permanent Registration Key: 1. 746-609 2. 778-499 This project is fully open source
    Downloads: 1 This Week
  • 23
    JNIZ music notation audio to midi

    music composition and notation software, audio to midi converter

    The Jniz project has been discontinued. The new Web version is now JnizWeb, hosted on GitLab (under construction): https://gitlab.com/jniz70/jnizweb/ Demo: https://jniz70.gitlab.io/jnizweb/ Jniz is a piece of software designed for musicians as a support tool for musical composition. It allows you to build and harmonize several voices according to the rules of classical harmony. Sound/audio-to-MIDI converter: real-time conversion of any monophonic sound (voice, instrument etc.) into...
    Downloads: 4 This Week
  • 24
    Silent Meditation

    Meditation timers for your MP3 player

    This project provides a set of silent audio files with chimes at timed intervals. It aims to replace overly-complicated meditation timers and mobile apps that serve the same purpose but with more nuisance and overhead. These are not guided meditation sessions. There are no voices. Simply copy any of these files to your phone or MP3 player and begin your silent meditation session. No longer will you need to watch the clock. A chime will tell you when the time is up. You can even combine a number of sessions into a playlist to design your own custom session.
    Downloads: 1 This Week
  • 25
    Audovia

    Database application for making music using JFugue MusicStrings

    ...You can use notes from C0 to G10, corresponding to MIDI values 0 to 127. Middle C is C5. Songs can be exported to MIDI or WAV for music processing or publishing. Please visit the Audovia website for more information: https://songbase.github.io/
    Downloads: 0 This Week
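The note range quoted for Audovia (C0 to G10 mapping to MIDI values 0 to 127, with Middle C as C5) implies the relation MIDI = 12 × octave + semitone offset. The sketch below works that arithmetic through; the function name `note_to_midi` and the accidental handling (`#` for sharp, `b` for flat) are assumptions for illustration, not Audovia's or JFugue's actual parser.

```python
# Semitone offsets of the natural notes within one octave (C = 0 ... B = 11).
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(note):
    """Convert a note name such as 'C5' or 'F#3' to a MIDI value, taking C0 as 0."""
    letter = note[0].upper()
    rest = note[1:]
    accidental = 0
    if rest[:1] == "#":        # sharp raises the pitch by one semitone
        accidental, rest = 1, rest[1:]
    elif rest[:1] == "b":      # flat lowers the pitch by one semitone
        accidental, rest = -1, rest[1:]
    value = 12 * int(rest) + SEMITONES[letter] + accidental
    if not 0 <= value <= 127:
        raise ValueError(f"{note} is outside the MIDI range 0-127")
    return value


print(note_to_midi("C0"), note_to_midi("C5"), note_to_midi("G10"))
```

Under this convention the lowest note C0 is 0, Middle C (C5) is 12 × 5 = 60, and G10 is 12 × 10 + 7 = 127, which matches the stated range.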