Showing 22 open source projects for "audio enhancement"

  • 1
    NovaSR

    A lightning fast audio upsampler

    ...NovaSR is especially valuable for post-processing tasks in speech enhancement, TTS pipelines, and dataset restoration where low sampling rates degrade perceived audio clarity; the minimal model size also makes it suitable for edge and embedded use cases where memory is at a premium. Its performance can reach thousands of times realtime on modern GPUs, allowing massive audio batches to be processed with negligible compute overhead.
    Downloads: 4 This Week
  • 2
    HunyuanCustom

    Multimodal-Driven Architecture for Customized Video Generation

    HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions, while maintaining subject/identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images. The architecture builds on HunyuanVideo, with added modules for...
    Downloads: 0 This Week
  • 3
    WanGP

    AI video generator optimized for low-VRAM and older GPUs

    ...It also integrates a wide range of utilities such as prompt enhancement, mask editing, motion design, and extraction tools for pose, depth, and flow data to support advanced video workflows.
    Downloads: 6 This Week
  • 4
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...
    Downloads: 6 This Week
  • 5
    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    VoxCPM is a tokenizer-free text-to-speech system that models speech in a continuous space, aiming for extremely realistic, context-aware synthesis and true-to-life zero-shot voice cloning. Instead of converting speech into discrete tokens, it uses an end-to-end diffusion-autoregressive architecture built on the MiniCPM-4 backbone, combining hierarchical language modeling, finite scalar quantization (FSQ), and local Diffusion Transformers. This design helps decouple semantic and acoustic...
    Downloads: 4 This Week
  • 6
    AzioVoice Recorder

    AzioVoice Recorder is an app designed to record audio

    Description
    Starting from version 1.3.1.0, the project has been renamed to AzioVoice Recorder and is officially published in the Microsoft Store at: https://apps.microsoft.com/detail/9PP795T0KSFP

    The app supports recording in WAV format with adjustable settings like sample rate, bit depth, and channels. It includes several audio filters for basic enhancement and features a simple file explorer for playback and management. Themes can be switched between dark and light, with settings saved persistently.

    Core Functionality
    * Multi-Device Support: Record from any available audio input device
    * High-Quality Recording: WAV format output with configurable audio settings
    * File Management: Browse, play, and delete recordings with metadata display
    * Customizable Settings: Configure sample rate, bit depth, and channel count
    Downloads: 0 This Week
  • 7
    EQUALIZER-MASTER
    🚀 Getting Started

    1. Prerequisites
       Download and install Equalizer APO. During installation, select your primary audio device (Speakers/Headphones). Restart your computer.

    2. Installation / Running
       You can run this app in two ways:
       Option A: Portable (Recommended). Download the latest release. Extract the folder. Run Equalizer Master.exe.
       Option B: Setup. Run Equalizer Master Setup.exe to install it to your system.

    3. First Setup
       When you first open the app, it...
    Downloads: 37 This Week
  • 8
    RSPpmp3

    multi-media player updated yearly

    Win64/Win32 DLL and C# sample project (a C project may follow soon) to play all audio and video files supported by the Libav library; this release uses the new Libav libraries available as of 14 Nov 2022. A stripped version with only standard audio formats (MP3, Ogg, AAC, Opus) is now also available, marked as the no-Libav version. Since 25 Nov 2024, some of the WAV effects are being converted to Edge plugins with the valuable help of Copilot; the plugins are MIT-licensed. With the help of Microsoft Copilot we are adding Lucid Mode video enhancement...
    Downloads: 0 This Week
  • 9
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    ...Demucs supports GPU-accelerated inference and can process multi-channel audio with chunked streaming for real-time or batch operation. It also provides training scripts and utilities to fine-tune on custom datasets, along with remixing and enhancement tools.
    Downloads: 83 This Week
  • 10
    Music Source Separation

    Separate audio recordings into individual sources

    Music Source Separation is a PyTorch-based open-source implementation for the task of separating a music (or audio) recording into its constituent sources — for example isolating vocals, instruments, bass, accompaniment, or background from a mixed track. It aims to give users the ability to take any existing song and decompose it into separate stems (vocals, accompaniment, etc.), or to train custom separation models on their own datasets (e.g. for speech enhancement, instrument isolation, or other audio-separation tasks). ...
    Downloads: 5 This Week
  • 11
    VoiceFixer

    General Speech Restoration

    ...Unlike many single-purpose noise reduction tools, VoiceFixer targets a “general speech restoration” problem (GSR), capable of handling multiple types of distortions at once, which makes it suitable for old recordings, phone-call audio, amateur voice recordings, or archival media. Evaluations show that VoiceFixer significantly improves both objective and subjective audio quality compared to baseline speech-enhancement methods.
    Downloads: 5 This Week
  • 12
    eqMac

    System Audio Equalizer for macOS - Parametric EQ & Volume Mixer

    eqMac is a System Audio Equalizer and processor for Apple macOS platform.
    Downloads: 5 This Week
  • 13
    Denoiser

    Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

    Denoiser is a real-time speech enhancement model operating directly on raw waveforms, designed to clean noisy audio while running efficiently on CPU. It uses a causal encoder-decoder architecture with skip connections, optimized with losses defined both in the time domain and frequency domain to better suppress noise while preserving speech. Unlike models that operate on spectrograms alone, this design enables lower latency and coherent waveform output.
    Downloads: 0 This Week
  • 14
    NanoDSP Open Source DSP

    Audio Enhancer for Windows and Embedded Platform

    NanoDSP is designed for embedded operation and aims for low CPU load.

    Main functions
    1. Bass amplification using a quadratic curve. Applying a quadratic curve to the bass component creates distortion that generates odd and even harmonics; these rely on the missing-fundamental effect in human hearing to create the illusion of bass enhancement.
    2. Separation of bass from mid and treble using a moving average. By using the moving average, the...
    Downloads: 4 This Week
  • 15
    PlusV

    An Open Source alternative to SBR (Spectral band replication)

    What is PlusV? With traditional MP3, a typical near-CD-quality audio file is encoded at a data rate of 128 kbit/s. While this is fine for people with big hard disks and fast Internet connections, this data rate has clearly been a bottleneck for people using modems or storing their music on the 32 or 64 MB flash cards of portable players. PlusV is a brand new audio compression enhancement technology that allows audio files to be compressed to as little as 64 or even 48 kbit/s. ...
    Downloads: 2 This Week
  • 16

    Weasel Audio Library

    A JavaScript library for playing Ultimate Soundtracker music modules.

    Web Enabled Audio and Sound Enhancement Library, or Weasel for short! A GPL 3 JavaScript library for playing music from Amiga Soundtracker modules, supporting:
    * Ultimate Soundtracker
    * The Jungle Command Soundtracker 2
    * Def Jam Soundtracker 3
    * DOC Soundtracker 9
    * DOC Soundtracker 2.2
    * Spreadpoint Soundtracker 2.3
    * Spreadpoint Soundtracker 2.5
    * Noisetracker 1.1
    * Noisetracker 2.0
    * Protracker 1.1 & 2
    * TakeTracker/FastTracker 1
    Downloads: 0 This Week
  • 17

    Distant Speech Recognition

    Beamforming and Speech Recognition Toolkit

    BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation, and echo cancellation. The Millennium ASR provides C++ and Python libraries for automatic speech recognition; it implements a weighted finite state transducer (WFST) decoder along with training and adaptation methods. These toolkits are meant for facilitating research and...
    Downloads: 1 This Week
  • 18

    Rpg.NET

    Performance enhancement API for RPG Maker XP, VX, VXA.

    Rpg.NET is an API built using the .NET Framework for both improving performance and enhancing the RPG Maker series (XP, VX, and VXA). Contained within are various functions and classes that extend the ability of RPG Maker in areas such as graphics, audio, and Windows API interop. Included is a Ruby script that is the wrapper around the library, so it can be used as any other script within your game.
    Downloads: 13 This Week
  • 19
    Real time stereo audio enhancement processor utilizing HRIRs, real time stereo decomposition, image widening, noise & harmonic generation to enhance & produce virtual audio sources. Processing is performed by a parallel FFT OLA STFT processor.
    Downloads: 0 This Week
  • 20
    PlayThemAll Media Player
    PlayThemAll Media Player. This project is an attempt to create a simple GUI-based media player that can play media files in almost all prevalent formats. It is written entirely in GNU/C++ using Qt 4. Image: http://bit.ly/DQnsQ
    Downloads: 0 This Week
  • 21
    This is a web application written in C# that provides a universal transcoding, transformation, and enhancement solution for images, rich media, and documents. It exposes all of its services over the web.
    Downloads: 0 This Week
  • 22
    Maestro

    Maestro

    Offline AI orchestration with a modern UI & model integration

    LM-Kit Maestro is a powerful offline desktop application that lets you orchestrate AI agents directly on your local machine using a modern, clean interface. Built on the robust LM-Kit.NET framework with .NET MAUI and Razor, Maestro enables you to create personalized chatbots and conversational agents while ensuring your data remains secure with no external transfers. Evaluate each model’s performance based on your hardware and switch seamlessly between multiple models during a single...
    Downloads: 0 This Week