Showing 1257 open source projects for "video to convert text"

  • 1
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. It is a deep learning-based video subtitle extraction framework that covers subtitle region detection and subtitle content extraction. Text recognition runs locally with OCR, so there is no need to apply for, set up, or call a third-party API, or to access online OCR services such as Baidu...
    Downloads: 55 This Week
  • 2
    Olive Video Editor

    Free open-source non-linear video editor

    0.2 is the upcoming major release of Olive. It's a complete rewrite from the ground up designed around cutting-edge features to help you make the best videos possible. Olive 0.2 provides powerful and flexible node-based compositing. Node editing is a form of visual programming that gives you full control over how Olive renders your video. Rather than a "fixed" pipeline where one effect occurs after the other, nodes allow you to connect anything to anything else allowing a ton of flexibility...
    Downloads: 28 This Week
  • 3
    Make-A-Video - Pytorch (wip)

    Implementation of Make-A-Video, new SOTA text to video generator

    Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch. They combine pseudo-3d convolutions (axial convolutions) and temporal attention and show much better temporal fusion. Pseudo-3d convolutions aren't a new concept. They have been explored before in other contexts, say for protein contact prediction as "dimensional hybrid residual networks". The gist of the paper comes down to this: take a SOTA text-to-image model (here they use DALL-E2, but the same learning...
    Downloads: 3 This Week
  • 4
    Video Diffusion - Pytorch

    Implementation of Video Diffusion Models

    Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving mnist (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at Imagen-pytorch...
    Downloads: 0 This Week
  • 5
    yt-dlp

    A youtube-dl fork with additional features and fixes

    yt-dlp is a youtube-dl fork based on the now inactive youtube-dlc. The main focus of this project is adding new features and patches while also keeping up to date with the original project
    Downloads: 217 This Week
  • 6
    LZ4

    Extremely fast compression algorithm

    LZ4 is a lossless compression algorithm, providing compression speed > 500 MB/s per core (>0.15 Bytes/cycle). It features an extremely fast decoder, with speed in multiple GB/s per core (~1 Byte/cycle). A high compression derivative, called LZ4_HC, is available, trading customizable CPU time for compression ratio. The LZ4 library is provided as open-source software using a BSD license. This benchmark simulates simple "static content transfer" scenario such as OS Kernel compression or video game's...
    Downloads: 265 This Week
  • 7
    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    ..., color tone, and more, for high-quality, customizable video styles. The model is trained on significantly larger datasets than its predecessor, greatly enhancing motion complexity, semantic understanding, and aesthetic diversity. Wan2.2 also open-sources a 5-billion parameter high-compression VAE-based hybrid text-image-to-video (TI2V) model that supports 720P video generation at 24fps on consumer-grade GPUs like the RTX 4090. It supports multiple video generation tasks including text-to-video.
    Downloads: 114 This Week
  • 8
    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it is released as a self-contained application and bundles a jlinked JDK while version 3 requires a Java Runtime Environment 8 with JavaFx...
    Downloads: 97 This Week
  • 9
    Subtitle Edit

    The subtitle editor

    Subtitle Edit (SE) is a free, open‑source subtitle editor for creating, editing, synchronizing, and converting subtitles. It supports a wide range of formats (over 300) and offers both graphical and text-based editing views.  Easy insertion, deletion, and shift of subtitle lines. Portable versions available (.NET 4.8, 32/64-bit), runs on Windows and via compatibility on Linux. Active development with frequent updates and issue tracking. Plugin support and rich editing tools (e.g., translation...
    Downloads: 77 This Week
  • 10
    HandBrake

    An open source tool to convert video from any format to modern codecs

    HandBrake is an open-source, GPL-licensed, multiplatform, multithreaded video transcoder, available for MacOS X, Linux and Windows.
    Downloads: 44 This Week
  • 11
    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse...
    Downloads: 51 This Week
  • 12
    Wan2.1

    Wan2.1: Open and Advanced Large-Scale Video Generative Model

    Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text...
    Downloads: 34 This Week
  • 13
    my-tv

    Android IPTV player supporting custom TV sources and quick install

    my‑tv is a popular open-source Android app for live TV streaming via custom IPTV sources. With over 32k GitHub stars, it allows users to watch TV channels by installing directly on devices like TV boxes via APK or ADB. Features include USB and Xiaomi TV Assistant install options, remote‑configurable channel lists via QR code or JSON, and flexible support for text, M3U, and JSON playlists. The app emphasizes simplicity—install and play—with a community-driven source list and regular enhancements...
    Downloads: 54 This Week
  • 14
    Nextcloud Server

    A safe home for all your data

    Nextcloud server is a free and open source server software that allows you to store all of your data in a server of your choosing. With Nextcloud you can easily access and store data in the data center you trust, sync data among various devices, and share your data for collaboration purposes. It offers the best security in the self hosted file sync and share world, and is expandable with hundreds of apps.
    Downloads: 46 This Week
  • 15
    p5.js

    Client-side JS platform for artists, designers and students to express

    ... objects for text, input, video, webcam, and sound. p5.js is an interpretation of Processing for today’s web. We hold events and operate with support from the Processing Foundation. For self-learners and animators, artists, game makers, creative technologists, curriculum planners, designers, graphic designers, graphics editors, learning experience designers, project managers, software engineers, students, teachers, university faculty members, visualization researchers, etc.
    Downloads: 39 This Week
  • 16
    WebCord

    A Discord and SpaceBar Electron-based client

    Nowadays, WebCord is quite a complex project; it can be summarized as a pack of security and privacy hardenings, Discord feature reimplementations, Electron / Chromium / Discord bug workarounds, stylesheets, internal pages and a wrapped Discord page, designed to conform with the ToS as much as possible (or hide the changes that might violate it from Discord's eyes). WebCord does a lot to improve the privacy of the users. It blocks known tracing and fingerprinting methods, but it does not end...
    Downloads: 37 This Week
  • 17
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies...
    Downloads: 34 This Week
  • 18
    Caprine

    Elegant Facebook Messenger desktop app

    ... Settings. Convert your messenger to a dark theme. Ability to toggle last seen/typing indicators. Interface adapts to smaller sizes. In-house notifications to keep you up to date. Caprine is a third-party app and is not affiliated with Facebook. You can toggle dark mode in the View menu or with Command d / Control d.
    Downloads: 25 This Week
  • 19
    Open QR Code

    Open QR Code is an open-source, cross-platform app

    Open QR Code is an open-source, cross-platform application built with Flutter as its main framework, along with C, C++, Dart, Skia (a 2D rendering engine), Impeller (the default rendering engine on iOS), Java, and Kotlin. Open QR Code allows users to generate and scan QR codes effortlessly. The app is available on Android, Windows, and the Web. Users can generate QR codes from any text input, save them to their gallery, share them directly from the app, and scan QR...
    Downloads: 24 This Week
  • 20
    CogVideo

    text and image to video generation: CogVideoX (2024) and CogVideo

    CogVideo is an open source text-/image-/video-to-video generation project that hosts the CogVideoX family of diffusion-transformer models and end-to-end tooling. The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100). Current releases cover CogVideoX-2B, CogVideoX-5B, and the upgraded CogVideoX1.5-5B variants, plus...
    Downloads: 15 This Week
  • 21
    ImageMagick

    ImageMagick 7

    ImageMagick® is a free, open-source software suite, used for editing and manipulating digital images. It can be used to create, edit, compose, or convert bitmap images, and supports a wide range of file formats, including JPEG, PNG, GIF, TIFF, and PDF. ImageMagick is widely used in industries such as web development, graphic design, and video editing, as well as in scientific research, medical imaging, and astronomy. Its versatile and customizable nature, along with its robust image processing...
    Downloads: 19 This Week
  • 22
    Translate-Subtitle-File

    Subtitle Creation Assistant

    Subtitle group machine translation assistant - [Function 1: Translate subtitle file] .srt .ass .vtt [Function 2: Voice to text] (Drag in video or audio to recognize subtitles) (The latest version v4.1.0 Update time 2021 2 May 23) 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, Amazon, etc. (6 voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, Amazon ) Advantages: 1. You can use multiple service...
    Downloads: 11 This Week
  • 23
    Signal Desktop

    Private messenger for Windows, Mac, and Linux

    Say "hello" to a different messaging experience. An unexpected focus on privacy, combined with all of the features you expect. State-of-the-art end-to-end encryption (powered by the open source Signal Protocol) keeps your conversations secure. We can't read your messages or listen to your calls, and no one else can either. Privacy isn’t an optional mode, it’s just the way that Signal works. Every message, every call, every time. Share text, voice messages, photos, videos, GIFs and files...
    Downloads: 17 This Week
  • 24
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    ... be used to prepare raw data or improve existing training data to get more accurate ML models. The frontend part of the Label Studio app lives in the frontend/ folder and is written in React JSX. Multi-user labeling with sign up and login: when you create an annotation, it's tied to your account. Configurable label formats let you customize the visual interface to meet your specific labeling needs. Support for multiple data types including images, audio, text, HTML, time-series, and video.
    Downloads: 15 This Week
  • 25
    TTS Voice Wizard

    Speech to Text to Speech, sends text as OSC messages

    Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) Use TTS Voice Wizard's accessibility features to improve your VRChat experience (it works outside of VRChat too!) You can convert your Speech-to-Text and back to Speech through various Speech Recognition and Text-to-Speech methods. You can send what you say as OSC messages to VRChat to be displayed on your avatar using KillFrenzyAvatarText or VRChats...
    Downloads: 9 This Week