Showing 355 open source projects for "audio convert"

View related business solutions
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 1
    HandBrake

    HandBrake

    A open source video to convert video from any format to modern codecs

    HandBrake is an open-source, GPL-licensed, multiplatform, multithreaded video transcoder, available for MacOS X, Linux and Windows.
    Downloads: 383 This Week
    Last Update:
    See Project
  • 2
    NeuralNote

    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...
    Downloads: 43 This Week
    Last Update:
    See Project
  • 3
    File Converter

    File Converter

    Simple tool which allows you to convert and compress files

    File Converter is a minimalist open‑source tool (GPL‑3.0) that lets users convert and compress one or multiple files directly via the Windows Explorer context menu. It integrates with powerful back-end utilities—FFmpeg, ImageMagick, Ghostscript—to handle a broad range of media and document transformations. File Converter is a personal open source project started in 2014. I have put hundreds of hours adding, refining and tuning File Converter with the goal of making the conversion and...
    Downloads: 54 This Week
    Last Update:
    See Project
  • 4
    Recorder

    Recorder

    HTML5 js recording mp3 wav ogg webm amr format

    ​Supports microphone recording and real-time processing in most of the implemented getUserMediamobile and PC browsers, mainly including Chrome, Firefox, Safari, iOS 14.3+, Android WebView, Tencent Android X5 kernel (QQ, WeChat, Mini Program WebView) , uni-app (App, H5), and most Android phones updated after 2021 have their own browsers; do not support: UC-based kernel (typical Alipay), most of the old domestic mobile phones that have not been updated have their own browsers and any other...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 5
    OpenAI.fm

    OpenAI.fm

    Code for openai.fm, a demo for the OpenAI Speech API

    OpenAI.fm is an official interactive demo application built to showcase the OpenAI Speech API and its advanced text-to-speech capabilities, providing developers and creators with a hands-on web interface to convert text into high-quality, customizable audio using state-of-the-art TTS models. Developed using Next.js and the OpenAI Speech API, this demo illustrates how the latest neural voice models can produce natural, expressive speech with adjustable styles and voices, highlighting features like emotional range, tone, and real-time playback. ...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 6
    idonthavespotify

    idonthavespotify

    Effortlessly convert Spotify links to your preferred streaming service

    Copy a link from your favorite streaming service, paste it into the search bar, and voilà! Links to the track on all other supported platforms are displayed. If the original source is Spotify you'll even get a quick audio preview to ensure it's the right track.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    abogen

    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 8
    Speakr

    Speakr

    Speakr is a personal, self-hosted web application

    Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications. Behind the scenes, Speakr leverages modern TTS engines and streaming audio technologies to deliver smooth and responsive speech generation without noticeable delay. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    ebook2audiobook

    ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages

    ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes.
    Downloads: 32 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    A2M — Audio to MIDI

    A2M — Audio to MIDI

    A2M is a desktop app that converts AUDIO TO MIDI in one click.

    A2M (Audio To MIDI) is a simple desktop tool for transcribing local audio files into MIDI files with one click. It is designed primarily for piano recording transcription, and works best on solo piano recordings. Using A2M is straightforward: Select an audio file, click Convert, and the application generates a MIDI file automatically in your Downloads/A2M folder.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 11
    Scriberr

    Scriberr

    Self-hosted AI audio transcription

    Scriberr is a self-hosted AI-powered transcription platform designed to convert audio and video into highly accurate text while prioritizing privacy and local processing. Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. ...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 12
    Markdownify MCP Server

    Markdownify MCP Server

    Convert files and web content into clean, usable Markdown easily

    Markdownify MCP is a Model Context Protocol server that converts many types of files and web content into clean Markdown. It supports formats such as PDFs, images, audio with transcription, DOCX, XLSX, and PPTX, along with web sources like YouTube transcripts, Bing results, and general webpages. Markdownify MCP is designed to simplify content extraction and make data easier to read, share, and reuse in structured workflows. Developers can install dependencies, build, and run the server...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    AI-Media2Doc

    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    AI-Media2Doc is a web-based application that uses large language models to convert video and audio content into structured, readable documents in a single workflow. It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not uploaded externally. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    OpenAudible

    OpenAudible

    Audiobook Manager for Audible Users

    OpenAudible is a cross-platform audiobook manager designed for Audible users. Manage/Download all your audiobooks with this easy-to-use desktop application. Say goodbye to the hassle of managing your audiobooks across multiple devices. With OpenAudible, you can easily download, view, and manage all your Audible books in one place. Our lightning-fast conversion to MP3 and M4B audio formats makes it easy to enjoy your favorite books on any device. Plus, our automation features make it a breeze...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    EPUB to Audiobook Converter

    EPUB to Audiobook Converter

    EPUB to audiobook converter, optimized for Audiobookshelf

    EPUB to Audiobook Converter is a tool designed to convert EPUB ebooks into chaptered audiobooks, optimized specifically for Audiobookshelf servers. It reads each chapter from an EPUB file, generates audio using a chosen text-to-speech backend, and outputs separate MP3 files with chapter titles preserved as metadata to make navigation easier. The project supports multiple TTS providers, including Microsoft Azure TTS, EdgeTTS, OpenAI TTS, local Piper, and Kokoro via an OpenAI-compatible endpoint, allowing users to choose between cloud and self-hosted voices. ...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 16
    SoX is the Swiss Army Knife of sound processing utilities. It can convert audio files to other popular audio file types and also apply sound effects and filters during the conversion.
    Leader badge
    Downloads: 27,192 This Week
    Last Update:
    See Project
  • 17
    Qwen3-ASR

    Qwen3-ASR

    Qwen3-ASR is an open-source series of ASR models

    Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware language prediction so that outputs maintain both fidelity to the original speech and grammatical coherence. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    ffmpeg.wasm

    ffmpeg.wasm

    FFmpeg for browser, powered by WebAssembly

    ffmpeg.wasm is a pure WebAssembly (and JavaScript/TypeScript) port of FFmpeg that enables in-browser media recording, conversion, and streaming—letting developers perform video/audio processing entirely client-side without server uploads. Transpiled via Emscripten from FFmpeg and its codecs into WebAssembly. Supports both single-threaded and multi-threaded cores using web workers. Written in TypeScript for improved developer experience.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 19
    NExT-GPT

    NExT-GPT

    Code and models for ICML 2024 paper, NExT-GPT

    NExT-GPT is an open-source research framework that implements an advanced multimodal large language model capable of understanding and generating content across multiple modalities. Unlike traditional models that primarily handle text, NExT-GPT supports input and output combinations involving text, images, video, and audio in a unified architecture. The system connects a large language model with multimodal encoders and diffusion-based decoders so it can interpret information from different sensory formats and generate responses in different media types. This architecture allows the model to convert between modalities, such as generating images from text descriptions or producing audio or video outputs based on textual prompts. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    clone-voice

    clone-voice

    A sound cloning tool with a web interface, using your voice

    ...The tool supports around sixteen languages, including Chinese, English, Japanese, Korean, French, German, Italian, and others, and can capture reference voices directly from a microphone or from uploaded audio.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    XLD

    XLD

    A tool for transcoding lossless audio files

    X Lossless Decoder(XLD) is a tool for Mac OS X that is able to decode/convert/play various 'lossless' audio files. The supported audio files can be split into some tracks with cue sheet when decoding. It works on Mac OS X 10.4 and later.
    Leader badge
    Downloads: 4,360 This Week
    Last Update:
    See Project
  • 22
    wa-automate-nodejs

    wa-automate-nodejs

    WhatsApp tool for chatbots with advanced features

    wa-automate-nodejs is the most advanced NodeJS library which provides a high-level API to control WA. Want to convert your WA account to an API instantly? You can now with the CLI. For more details see Easy API. After executing create() function, @open-wa/wa-automate will create an instance of WA web. If you are not logged in, it will print a QR code in the terminal. Scan it with your phone and you are ready to go! @open-wa/wa-automate will remember the session so there is no need to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Audiblez

    Audiblez

    Generate audiobooks from e-books

    Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    BasedHardware

    BasedHardware

    Open source AI wearable platform for recording and summarizing speech

    Omi is an open source AI wearable platform designed to capture spoken conversations and convert them into useful digital information such as transcripts, summaries, and action items. It combines hardware, firmware, mobile applications, and backend services to create a complete ecosystem for voice-driven interaction. Users can connect the wearable device to a mobile phone and automatically record and transcribe meetings, conversations, and voice memos.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    VideoCaptioner

    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB