Search Results for "converting transcript to audio"

Showing 101 open source projects for "converting transcript to audio"

  • 1
    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos, as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling...
    Downloads: 1 This Week
  • 2
    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on the go, or for users who prefer audio over reading. The repository supports handling common ebook formats and generating outputs that combine audio plus caption metadata. ...
    Downloads: 7 This Week
  • 3
    Meetily

    Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper

    This project is a privacy-first AI meeting assistant that captures meeting audio, produces real-time transcripts, and generates summaries while keeping processing entirely on your own machine or infrastructure. It’s built for organizations that want meeting intelligence without sending recordings or transcripts to third-party cloud services, which helps address compliance and data sovereignty requirements. The app supports live transcription with local model options (including Whisper- and Parakeet-based workflows) and presents the transcript as the meeting happens, making it useful both for note-taking and accessibility. ...
    Downloads: 24 This Week
  • 4
    MARS5

    MARS5 speech model (TTS) from CAMB.AI

    ...The model is built to handle prosodically challenging content such as sports commentary, anime dialogue, and other high-energy or highly varied speech patterns with realistic rhythm and intonation. To control speaker identity, MARS5 uses a short reference audio clip, typically between 2 and 12 seconds, from which it learns the voice characteristics. It supports two main inference modes: shallow clone, which is faster and only needs the reference audio, and deep clone, which additionally uses the transcript of the reference audio to increase similarity and naturalness at the cost of more computation.
    Downloads: 0 This Week
  • 5
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    AI-Media2Doc is a web-based application that uses large language models to convert video and audio content into structured, readable documents in a single workflow. It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not...
    Downloads: 1 This Week
  • 6
    Agili Hacker Podcast

    AI tool that turns Hacker News posts into daily podcast updates

    Hacker Podcast is an AI-powered project that turns top Hacker News stories into a Chinese podcast. It automatically fetches trending posts each day, processes the content with AI, and generates concise summaries before converting them into audio. This creates a hands-free way to stay updated on tech, startups, and developer discussions without reading long threads. Hacker Podcast combines content aggregation, natural language processing, and text-to-speech to deliver clear and digestible updates. Users can listen through web interfaces or podcast platforms, while also accessing written summaries for deeper reading. ...
    Downloads: 1 This Week
  • 7
    LARA is software for musical analysis using (new) scientific methods for analysis and visualization. LARA is part of the core research: “Interpretation and performance” of the HSLU – Musik (University of Applied Sciences Luzern – Music depart
    Downloads: 2 This Week
  • 8
    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. ...
    Downloads: 8 This Week
  • 9
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    easyVoice is an open-source text-to-speech platform aimed at turning long-form text and novels into high-quality audio, with a strong focus on usability and scalability. It provides a web interface where users can paste or upload large texts and generate speech and subtitles in a single workflow, even for works exceeding 100,000 characters. The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure...
    Downloads: 0 This Week
  • 10
    Docling

    Get your documents ready for gen AI

    ...Its modular architecture allows developers to extend functionality and integrate specialized models for tasks such as OCR and audio transcription. Overall, Docling serves as a comprehensive preprocessing layer for AI applications that require reliable, structured access to complex document data.
    Downloads: 1 This Week
  • 11
    VoxCPM

    TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

    ...Trained on a large 1.8-million-hour bilingual corpus, VoxCPM can infer appropriate speaking style from context, dynamically adjusting intonation, rhythm, and emotional tone. It supports zero-shot voice cloning from a short reference audio clip, capturing timbre, accent, and pacing to closely mimic a target speaker without per-speaker fine-tuning.
    Downloads: 2 This Week
  • 12
    VidCoder

    A Blu-ray, DVD and video file transcoder for Windows

    VidCoder is a Windows-based open-source video transcoding and ripping tool that provides a graphical interface built around standard command-line multimedia tools. It lets users convert video files (or rip DVDs/Blu-rays, when supported) into modern formats and codecs, making it useful for people who want to compress, re-encode, or transcode video content without dealing directly with low-level encoder settings. Because VidCoder integrates and automates the invocation of complex backend...
    Downloads: 8 This Week
  • 13
    JarkViewer

    A lightweight, lightning-fast, and powerful image viewer

    ...The viewer handles static images, animated formats (GIF, animated WebP/PNG/APNG/JXL/AVIF), and special “live” photo types such as iOS Live Photos (.livp) and Android Motion Photos/Micro Videos, although audio playback for live photos is not yet supported. It also supports a broad set of RAW formats from various camera manufacturers, allowing photographers to browse their libraries without converting to JPEG first.
    Downloads: 35 This Week
  • 14
    Streamer-Sales

    Large language model for live-stream sales hosts

    Streamer-Sales is an open-source large language model system designed specifically for e-commerce live streaming and automated product promotion. The project focuses on generating persuasive product descriptions and live presentation scripts that mimic the style of professional online sales hosts. By analyzing product characteristics and marketing information, the model can produce engaging explanations that emphasize benefits, features, and emotional appeal to encourage viewers to make...
    Downloads: 0 This Week
  • 15
    CC2.TV / CC2 - Audio- und TV-Datenbank

    Meta-database application for CC2.TV's audio and TV broadcasts

    This program provides a meta-database application for CC2.TV's audio and video broadcasts on GNU/Linux systems. It lets you search, manage, and play the extensive content of the CC2.TV audiocast and videocast. The aim is to make the more than 3,000 audiocast topics and more than 1,000 videocast topics, which focus on computing, technology, and social issues, conveniently accessible. For full functionality,...
    Downloads: 0 This Week
  • 16
    MediaCoder

    Universal media transcoding software

    MediaCoder is universal media transcoding software, actively developed and maintained since 2005. It brings together the most cutting-edge audio/video technologies into an out-of-the-box transcoding solution with a rich set of adjustable parameters that let you take full control of your transcoding. New features and the latest codecs are added or updated constantly. MediaCoder might not be the easiest tool out there, but what matters here is quality and performance. It will be your Swiss Army knife for...
    Downloads: 792 This Week
  • 17
    WAV-PRG
    WAV-PRG is a program for converting Commodore 64 tapes to PC and back. It is designed not to require any custom-built cables: transfers between PC and tape are done by means of a tape player/recorder connected to the PC's soundcard by a plain audio cable
    Downloads: 83 This Week
  • 18
    Ainee

    Ainee - AI Notetaking and Learning Companion

    Ainee is your ultimate AI-powered notetaking and learning companion. Capture lecture notes in real-time and effortlessly transform audio, text, files, and YouTube videos into formatted notes, mindmaps, quizzes, flashcards, podcasts, and more. Explore our AI meeting note taker, AI notes, video transcript generator, PDF to AI converter, and AI flashcard maker. Enhance your learning with our AI voice recorder, article summarizer AI, and AI quiz generator. Additionally, share your knowledge base with others to foster the flow of information and help new users benefit from collective insights. ...
    Downloads: 0 This Week
  • 19
    Vimer

    Adjust or convert audio and video just by describing what you want

    Vimer (VIdeo transforMER) is an AI-powered GUI for FFmpeg, a cross-platform, multilingual app with a hassle-free interface for adjusting audio and video. You just choose the files you want to change, describe what you want, and let the artificial intelligence take care of the rest, automatically generating and executing the necessary FFmpeg commands. Forget complicated code. Whether converting formats, adjusting quality, adding effects, adjusting audio, or mixing different media, Vimer offers a simple path to media conversion and editing without the need for advanced technical knowledge. ...
    Downloads: 0 This Week
  • 20
    footswitch2basic

    Audio transcription software for Linux (VLC) with a foot pedal

    Footswitch 2 (Basic) is a media player for transcribers on Linux. This version is a stripped-down version of Footswitch2, containing only the absolute essentials for transcription. Written in Python and using the Python bindings for VLC, it allows a transcriber to control the audio or video with a foot pedal, and includes a set of macros that integrate into LibreOffice. This allows the transcriber to control the media player from within LibreOffice as well, making it useful for those who do...
    Downloads: 0 This Week
  • 21
    N_m3u8DL-CLI

    Simple CLI tool to download m3u8 streams to MP4/TS with rich options

    N_m3u8DL-CLI is a cross-platform command-line downloader for m3u8 (HLS) playlists. It converts streams to MP4 or TS and offers rich command-line options. While the original CLI uses .NET Framework (Windows), its successor N_m3u8DL-RE adds true cross-platform support for Windows, Linux, and macOS. It is commonly used in media workflows for downloading and converting streaming video.
    Downloads: 31 This Week
  • 22
    JAFG - Just Another FFmpeg GUI
    JAFG, or Just Another FFmpeg GUI, is an interface to FFmpeg. JAFG converts audio files to other audio formats and video files to other video formats. It lets you change the audio bitrate, audio sampling rate, audio channels, video codec, video bitrate, video size, aspect ratio, and frame rate. JAFG can also convert to DVD, DV, VCD, and SVCD in PAL, NTSC, or film formats, and supports capturing screenshots, screen recording, and YouTube downloading.
    Downloads: 1 This Week
  • 23
    YouTube To Mp4 Converter

    Turn YouTube videos into the highest-quality MP4

    YouTube To Mp4 Converter is free PC software that lets you convert YouTube videos to HD MP4 quickly and easily. In particular, it lets you choose the output quality of your MP4 videos, such as 720p, 1080p, 1440p, or 2160p. The software places no limit on video size. Anybody can easily install YouTube To Mp4 Converter for free. Using the right software to convert YouTube videos to HD MP4 saves bandwidth, since you don’t have to stream the same video again. You can keep the...
    Downloads: 3 This Week
  • 24

    ffmpeg-coder

    A small CLI tool which will help in producing different types of video

    A small cross-platform CLI tool that helps new ffmpeg CLI users quickly produce commands for different types of video conversion. The tool makes it easy for them to generate ffmpeg commands. If you don't use or know about FFmpeg, this tool might not be for you. It was created for those who find it difficult to understand or find the appropriate commands for converting a video the way they want using the FFmpeg CLI tool. Those who are new to the command line can also...
    Downloads: 4 This Week
  • 25
    Perl Audio Converter

    Linux Audio Converter / Tagger / CD Ripper

    A Linux CLI tool for converting multiple audio types from one format to another. It supports the following audio formats: 3G2, 3GP, 8SVX, AAC, AC3, ADTS, AIFF, AL, AMB, AMR, APE, AU, AVR, BONK, CAF, CDR, CVU, DAT, DTS, DVMS, F32, F64, FAP, FLA, FLAC, FSSD, GSRT, HCOM, IMA, IRCAM, LA, MAT, AUD, MAT4, MAT5, M4A, M4R, MP2, MP3, MP4, MP4A, MPC, MPP, NIST, OFF, OFR, OFS, OPUS, OGA, OGG, PAF, PRC, PVF, RA, RAM, RAW, RF64, SD2, SF, SHN, SMP, SND, SOU, SPX, SRN, TAK, TTA, TXW, VOC, VMS, VQF, W64, WAV, WMA, and WV. ...
    Downloads: 12 This Week