Showing 59 open source projects for "music ear training"

View related business solutions
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 1
    tksolfege ear training program

    tksolfege ear training program

    Music ear training exercises

    Tksolfege is an ear training program for learning to recognize chords, intervals, perform rhythm dictation, solfege dictation and singing solfege sequences. As you will discover, it is not an easy program to install and setup on your computer. You will also require to install the tcl/tk interpreter, fluidsynth, and at least one soundfont file. On Windows 11, you may also need to configure the operating system to show the file extension and the hidden directories.
    Downloads: 24 This Week
    Last Update:
    See Project
  • 2
    ACE-Step 1.5

    ACE-Step 1.5

    The most powerful local music generation model

    ACE-Step 1.5 is an advanced open-source foundation model for AI-driven music generation that pushes beyond traditional limitations in speed, musical coherence, and controllability by innovating in architecture and training design. It integrates cutting-edge generative techniques—such as diffusion-based synthesis combined with compressed autoencoders and lightweight transformer elements—to produce high-quality full-length music tracks with rapid inference times, capable of generating a complete song in seconds on modern GPUs while remaining efficient enough to run on consumer-grade hardware with minimal memory requirements. ...
    Downloads: 59 This Week
    Last Update:
    See Project
  • 3
    deepjazz

    deepjazz

    Deep learning driven jazz generation using Keras & Theano

    deepjazz is a deep learning project that generates jazz music using recurrent neural networks trained on MIDI files. The repository demonstrates how machine learning can learn musical structure and produce original compositions. It uses the Keras and Theano libraries to build a two-layer Long Short-Term Memory network capable of learning temporal patterns in music. The system analyzes musical sequences from an input MIDI file and then generates new musical notes that follow similar stylistic patterns. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    HeartMuLa

    HeartMuLa

    A Family of Open Sourced Music Foundation Models

    HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases. The project also includes HeartCodec, a music codec optimized for high reconstruction fidelity, enabling efficient tokenization and reconstruction workflows that are critical for training and generation pipelines. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    AudioCraft

    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 6
    MuseGAN

    MuseGAN

    An AI for Music Generation

    MuseGAN is a deep learning research project designed to generate symbolic music using generative adversarial networks. The system focuses specifically on generating multi-track polyphonic music, meaning that it can simultaneously produce multiple instrument parts such as drums, bass, piano, guitar, and strings. Instead of generating raw audio, the model operates on piano-roll representations of music, which encode notes as time-pitch matrices for each instrument track. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    audioFlux

    audioFlux

    A library for audio and music analysis, feature extraction

    ...It can be provided to deep learning networks for training and is used to study various tasks in the audio field such as Classification, Separation, Music Information Retrieval(MIR) ASR, etc.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    LTX-2.3

    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...
    Downloads: 95 This Week
    Last Update:
    See Project
  • 9
    Audiogen Codec

    Audiogen Codec

    48khz stereo neural audio codec for general audio

    AGC (Audiogen Codec) is a convolutional autoencoder based on the DAC architecture, which holds SOTA. We found that training with EMA and adding a perceptual loss term with CLAP features improved performance. These codecs, being low compression, outperform Meta's EnCodec and DAC on general audio as validated from internal blind ELO games. We trained (relatively) very low compression codecs in the pursuit of solving a core issue regarding general music and audio generation, low acoustic quality, and audible artifacts, which hinder industry use for these models. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    Qwen-Audio

    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and output text. There is also an instruction-tuned version called Qwen-Audio-Chat which supports conversational interaction (multi-round), audio + text input, creative tasks and reasoning over audio. It uses multi-task training over many different audio tasks (30+), and achieves strong multi-benchmarks performance without task-specific fine‐tuning. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    YuE

    YuE

    Open source AI model for generating full songs from lyrics prompts

    ...YuE introduces a family of models built on large language model architectures that process music generation as a sequence prediction task. YuE also incorporates techniques such as track-decoupled prediction and progressive conditioning to help manage complex audio signals and maintain consistency throughout long compositions. It includes inference scripts, prompt examples, evaluation tools, and training components that enable researchers and developers to experiment with AI-based music.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 12
    pyAudioAnalysis

    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    WavTokenizer

    WavTokenizer

    SOTA discrete acoustic codec models with 40/75 tokens per second

    WavTokenizer is a state-of-the-art discrete acoustic codec designed specifically for audio language modeling, capable of compressing 24 kHz audio into just 40 or 75 tokens per second while preserving high perceptual quality. It is built to represent speech, music, and general audio with extremely low bitrate, making it ideal as a front-end for large audio language models like GPT-4o and similar architectures. The model uses a single-quantizer design together with temporal compression to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Amphion

    Amphion

    Toolkit for audio, music, and speech generation

    Amphion is a toolkit from OpenMMLab dedicated to audio, music, and speech generation, aimed at both reproducible research and helping newcomers get started in generative audio. It provides standardized implementations and recipes for classic and state-of-the-art generative models in audio, including TTS, music generation, and voice conversion. A distinctive feature of Amphion is its emphasis on visualization: it offers interactive visualizations of model architectures and generation processes, making it easier to understand how complex generative audio models work. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    BikeControl

    BikeControl

    Do virtual gear shifting (and more) in any rider app

    ...This allows you to perform actions like virtual gear shifting, steering, changing workout intensity, controlling music, etc., even if the trainer app doesn’t natively support those controllers. It works on different platforms: on Android, it uses the AccessibilityService API to send simulated touches to whatever app window is active; on desktop (Windows/macOS), it can emulate key/mouse input via configurable keymaps so users can tailor controls.
    Downloads: 29 This Week
    Last Update:
    See Project
  • 16
    EarQuiz Frequencies

    EarQuiz Frequencies

    Software for technical ear training on equalization

    EarQuiz Frequencies is a software for ear training on equalization. Its goal is to help musicians, audio professionals, hobbyists and students learn how to hear frequency bands. Available for Windows 10, 11 (x64), macOS 11 or higher (both for Intel and Apple Silicon) and Linux. This application is based on (and deeply inspired by) the world-renowned Golden Ears method of David Moulton, whose course is half dedicated to building this essential critical listening skill. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    DiffRhythm

    DiffRhythm

    Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation

    DiffRhythm is an open-source, diffusion-based model designed to generate full-length songs. Focused on music creation, it combines advanced AI techniques to produce coherent and creative audio compositions. The model utilizes a latent diffusion architecture, making it capable of producing high-quality, long-form music. It can be accessed on Huggingface, where users can interact with a demo or download the model for further use. DiffRhythm offers tools for both training and inference, and its flexibility makes it ideal for AI-based music production and research in music generation.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    byzorgan

    byzorgan

    Specialized sound synthesizer with Byzantine Church music scales

    This software integrates a small, specialized synthesizer and vocal processor. It can be used to learn Byzantine Church singing. You can play from the keyboard, mouse or touch screen. MIDI input is also available. Voice functions include: pitch highlighting, synthesizer control by voice, pitch correction and voice-to-ison conversion. On the screen there are labels with symbols of Byzantine notes. There is a metronome. The program is oriented on the Chrysanthos tuning of the diatonic scale:...
    Downloads: 36 This Week
    Last Update:
    See Project
  • 19
    Youtube Downloader & Splitter - Ad-free

    Youtube Downloader & Splitter - Ad-free

    Another training development

    Warning please update yt-dlp module manully!!!! This application development started in mid 2020. Easy to use and one of the fastest method to get youtube music videos to mp3 Please use it fairly and use it legal way Qt6.6 based for Linux dotnet (8) +Avalonia UI for Linux develops still in progress.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Audio AI Timeline

    Audio AI Timeline

    A timeline of the latest AI models for audio generation

    Audio AI Timeline is a curated project that organizes the development of audio-related artificial intelligence into a structured and accessible historical timeline. Rather than functioning as a model training framework, it serves as an informational resource that maps key papers, systems, models, datasets, and milestones across areas such as speech synthesis, music generation, audio understanding, source separation, and general audio machine learning. The project helps users understand how major techniques and ideas evolved over time, making it especially useful for researchers, students, and practitioners who want a broad overview of the field without digging through scattered references. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Demucs

    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    ...The repository includes pretrained models for common tasks such as isolating vocals, drums, bass, and accompaniment from stereo music, achieving state-of-the-art results in benchmarks like MUSDB18. Demucs supports GPU-accelerated inference and can process multi-channel audio with chunked streaming for real-time or batch operation. It also provides training scripts and utilities to fine-tune on custom datasets, along with remixing and enhancement tools.
    Downloads: 91 This Week
    Last Update:
    See Project
  • 22
    AudioEqualizer
    Introducing AudioEqualize: Elevate Your Audio Experience! AudioEqualize isn't just your average volume adjustment tool; it's a sophisticated audio wizard that goes beyond simple peak amplitude normalization. Designed to enhance your music library, AudioEqualize meticulously analyzes and precisely tunes your MP3 files to a target volume of your choice. Here's why it's the ultimate choice for audio enthusiasts:
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Bend My Ear

    Bend My Ear

    Easy Training for The Hearing Impaired

    Bend My Ear is a copy of the sentences module from the PC version Angel Sound. angelsound.tigerspeech.com Angel has created all it's other Apps on Iphone and Ipad. I decided to transfer the Sentences Module to run on the Android Operating System. I have added 3 background sounds and the ability to use the original sound recordings or use the sound provided by the default voice on the Phone. The idea is to provide the ability to do some Rehab whenever You have a spare moment and your...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    audioFlux

    A library for audio and music analysis, feature extraction.

    audioflux is a deep learning tool library for audio and music analysis, feature extraction. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training, and is used to study various tasks in the audio field such as Classification, Separation, Music Information Retrieval(MIR) and ASR etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    DiffSinger

    DiffSinger

    Singing Voice Synthesis via Shallow Diffusion Mechanism

    ...The core idea is to view generation of a sung voice (mel-spectrogram) as a diffusion process: starting from noise, the model iteratively “denoises” while being conditioned on a music score (lyrics, pitch, musical timing). This avoids some of the typical problems of prior SVS models — like over-smoothing or unstable GAN training — and produces more realistic, expressive, and natural-sounding singing. The method introduces a “shallow diffusion” mechanism: instead of diffusing over many steps, generation begins at a shallow step determined adaptively, which leverages prior knowledge learned by a simple mel-spectrogram decoder and speeds up inference.
    Downloads: 40 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB