Search Results for "mega voice command"

Showing 94 open source projects for "mega voice command"

1

Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

...In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that captures voice characteristics; this embedding is then used by a Tacotron-style synthesizer to generate spectrograms from text, which a WaveRNN-based vocoder finally turns into audio. The repo includes both a command-line demo and a graphical “toolbox” application where you can load reference voices, type text, and hear the synthesized results interactively.

Downloads: 3 This Week

Last Update: 2026-03-09
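
The three stages described for Real-Time Voice Cloning (speaker encoder, synthesizer, vocoder) can be pictured as a chain of functions. The following is a minimal structural sketch only: `clone_voice` and the `toy_*` stand-ins are hypothetical names invented for illustration, and they contain no real neural models and are not the project's API.

```python
from typing import Callable, List, Sequence

Waveform = List[float]           # audio samples
Embedding = List[float]          # fixed-dimensional speaker vector
Spectrogram = List[List[float]]  # one list of values per frame


def clone_voice(
    reference_clips: Sequence[Waveform],
    text: str,
    encoder: Callable[[Sequence[Waveform]], Embedding],
    synthesizer: Callable[[str, Embedding], Spectrogram],
    vocoder: Callable[[Spectrogram], Waveform],
) -> Waveform:
    """Run the three stages in order and return synthesized audio."""
    embedding = encoder(reference_clips)        # stage 1: clips -> speaker embedding
    spectrogram = synthesizer(text, embedding)  # stage 2: text + embedding -> spectrogram
    return vocoder(spectrogram)                 # stage 3: spectrogram -> audio


# Toy stand-ins (assumptions for illustration, not real models).
EMBED_DIM = 4


def toy_encoder(clips: Sequence[Waveform]) -> Embedding:
    # Output size is always EMBED_DIM, regardless of how much audio comes in.
    samples = [s for clip in clips for s in clip]
    mean = sum(samples) / len(samples) if samples else 0.0
    return [mean] * EMBED_DIM


def toy_synthesizer(text: str, embedding: Embedding) -> Spectrogram:
    # One frame per character, shifted by the speaker embedding.
    return [[ord(ch) / 255.0 + e for e in embedding] for ch in text]


def toy_vocoder(spectrogram: Spectrogram) -> Waveform:
    # One output sample per frame.
    return [sum(frame) / len(frame) for frame in spectrogram]


audio = clone_voice([[0.0, 0.2], [0.4]], "hi", toy_encoder, toy_synthesizer, toy_vocoder)
```

Passing the stages in as callables keeps the data flow visible: the embedding is computed once from the reference clips and can then be reused for any amount of new text.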
2

FFsubsync

Automagically synchronize subtitles with video

Language-agnostic automatic synchronization of subtitles with video, so that subtitles are aligned to the correct starting point within the video. First, make sure ffmpeg is installed. Make sure ffmpeg is on your path and can be referenced from the command line! Next, grab the script. It should work with both Python 2 and Python 3. There may be occasions where you have a correctly synchronized srt file in a language you are unfamiliar with, as well as an unsynchronized srt file in your native language. In this case, you can use the correctly synchronized srt file directly as a reference for synchronization, instead of using the video as the reference. ffsubsync uses the file extension to decide whether to perform voice activity detection on the audio or to directly extract speech from an srt file. ffsubsync usually finishes in 20 to 30 seconds, depending on the length of the video.

Downloads: 44 This Week

Last Update: 2025-11-24
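
The FFsubsync description says the reference can be either a video or an already synchronized srt file, and that the file extension decides which path is taken. A small sketch of that decision, together with the `ffsubsync <reference> -i <unsynced> -o <output>` invocation form, might look like this. The helper names and the `SUBTITLE_SUFFIXES` set are assumptions made for illustration and are not part of ffsubsync itself.

```python
from pathlib import Path
from typing import List

# Assumption for this sketch: only .srt counts as a subtitle reference.
SUBTITLE_SUFFIXES = {".srt"}


def reference_kind(reference: str) -> str:
    """Classify the reference by file extension, as the description outlines."""
    suffix = Path(reference).suffix.lower()
    return "subtitles" if suffix in SUBTITLE_SUFFIXES else "video"


def build_sync_command(reference: str, unsynced: str, output: str) -> List[str]:
    """Build the argument list; the same form serves both reference kinds."""
    return ["ffsubsync", reference, "-i", unsynced, "-o", output]


command = build_sync_command("movie.mkv", "unsynced.srt", "synced.srt")
# To run it (requires ffsubsync and ffmpeg on the PATH):
#   import subprocess; subprocess.run(command, check=True)
```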
3

pyVideoTrans

Translate the video from one language to another and embed dubbing

...The tool supports both command-line and GUI modes, making it accessible to developers and creatives needing batch or automated processing.

Downloads: 21 This Week

Last Update: 2026-03-10
4

SafeClaw

Chat with it via text and voice

SafeClaw is an open-source, entirely local alternative to cloud-based AI assistants like OpenClaw, enabling users to build a personal assistant that runs on their own machine without incurring API usage charges or exposing data to third-party services. It emphasizes privacy and predictability by using traditional programming, rule-based intent parsing, and established machine learning tools rather than large language models, meaning there are no per-token API costs and deterministic...

Downloads: 7 This Week

Last Update: 2026-03-13
5

OpenAI-Compatible Edge-TTS API

Free, high-quality text-to-speech API endpoint to replace OpenAI

...A Docker image is provided for one-command deployment, and environment variables can be used to configure default voice, language, response format, authentication, and logging options.

Downloads: 0 This Week

Last Update: 2025-11-28
6

peon-ping

Warcraft III Peon voice notifications (+ more!) for Claude Code

Peon-ping is a quirky utility that brings fun and practical voice notifications to your development workflow by using Warcraft III peon-style sound effects whenever significant events occur in your code editor or terminal. The project is built around the idea of reducing cognitive load by audibly alerting you when processes finish, tests fail, or language models complete responses, helping you stay focused without constantly watching the screen.

Downloads: 3 This Week

Last Update: 17 hours ago
7

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...

Downloads: 5 This Week

Last Update: 2025-11-30
8

MLX-Audio

A text-to-speech, speech-to-text and speech-to-speech library

MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. ...

Downloads: 8 This Week

Last Update: 6 days ago
9

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications.

Downloads: 13 This Week

Last Update: 2025-12-12
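
Scripting the `edge-tts` command-line tool from another program could be sketched as below. The `--text`, `--write-media`, and `--voice` options belong to the edge-tts CLI; the `build_edge_tts_command` helper, the output file name, and the chosen voice are assumptions made for this example.

```python
from typing import List, Optional


def build_edge_tts_command(
    text: str, output_path: str, voice: Optional[str] = None
) -> List[str]:
    """Build an argument list for the edge-tts CLI without running it."""
    command = ["edge-tts", "--text", text, "--write-media", output_path]
    if voice is not None:
        # Omitting --voice lets the tool fall back to its default voice.
        command += ["--voice", voice]
    return command


command = build_edge_tts_command("Hello, world!", "hello.mp3", voice="en-US-AriaNeural")
# To run it (requires edge-tts to be installed and network access):
#   import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the text contains spaces or punctuation.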
10

Harbor LLM

Run a full local LLM stack with one command using Docker

Harbor is an open source, containerized toolkit designed to simplify running local large language model (LLM) environments. It combines a CLI and companion app to launch backends, frontends, and supporting services with minimal setup. With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. ...

Downloads: 0 This Week

Last Update: 2 days ago
11

Remove Windows Ai

Strip Windows 11 of built-in AI features for control and privacy

RemoveWindowsAI is an open source PowerShell-based tool created to help users regain control over their Windows 11 experience by disabling or removing AI-related features that Microsoft has increasingly integrated into the OS. It is designed to work with currently released, stable versions of Windows 11 and is continuously updated to match newly added AI components, especially since the 25H2 major update. The script covers a wide variety of AI surfaces (from core features like Copilot and Recall...

Downloads: 77 This Week

Last Update: 5 hours ago
12

MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server

MiniMax-MCP is the official Model Context Protocol (MCP) server for accessing MiniMax’s multimodal generative APIs from MCP-compatible clients. It acts as a bridge between tools like Claude Desktop, Cursor, Windsurf, OpenAI Agents, and the MiniMax platform, exposing capabilities such as text-to-speech, voice cloning, image generation, text-to-image, video generation, image-to-video, text-to-video, and music generation. The server is written in Python and distributed under the MIT license,...

Downloads: 0 This Week

Last Update: 2026-01-07
13

annyang!

Speech recognition for your site

annyang is a tiny JavaScript library that lets your visitors control your site with voice commands. annyang supports multiple languages, has no dependencies, weighs just 2 KB, and is free to use. annyang understands commands with named variables, splats, and optional words. Use named variables for one-word arguments in your command. Use splats to capture multi-word text at the end of your command (greedy). Use optional words or phrases to define a part of the command as optional. annyang plays nicely with all browsers, progressively enhancing browsers that support SpeechRecognition, while leaving users with older browsers unaffected. ...

Downloads: 2 This Week

Last Update: 2026-03-11
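
The three annyang command features (named variables, splats, and optional words) amount to pattern matching over the recognized phrase. The sketch below illustrates the idea in Python. It is not annyang's implementation (annyang is a JavaScript library), the function names are hypothetical, and it assumes the `:name`, `*name`, and `(optional words)` notation for the three features.

```python
import re
from typing import Dict, Optional


def command_to_regex(command: str) -> "re.Pattern[str]":
    """Translate a command phrase into a regex, one token at a time.

    Limitation of this sketch: an optional phrase must not be the first token.
    """
    tokens = re.findall(r"\([^)]*\)|\S+", command)
    pieces = []
    for index, token in enumerate(tokens):
        sep = r"\s+" if index > 0 else ""
        if token.startswith("(") and token.endswith(")"):
            # Optional phrase: the whole group, including its leading space, may be absent.
            words = r"\s+".join(re.escape(w) for w in token[1:-1].split())
            pieces.append("(?:" + sep + words + ")?")
        elif token.startswith(":"):
            # Named variable: exactly one word.
            pieces.append(sep + "(?P<" + token[1:] + r">\S+)")
        elif token.startswith("*"):
            # Splat: greedy capture of the rest of the phrase.
            pieces.append(sep + "(?P<" + token[1:] + ">.+)")
        else:
            pieces.append(sep + re.escape(token))
    return re.compile("^" + "".join(pieces) + "$", re.IGNORECASE)


def match_command(command: str, utterance: str) -> Optional[Dict[str, str]]:
    """Return the captured variables, or None when the utterance does not match."""
    match = command_to_regex(command).match(utterance.strip())
    return match.groupdict() if match else None
```

For example, `match_command("search for *query", "search for cute cat pictures")` captures the whole trailing phrase under `query`, while a named variable such as `:item` accepts only a single word.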
14

Eris

A NodeJS Discord library

A Node.js wrapper for interfacing with Discord. You will need NodeJS 10.4+. If you need voice support you will also need Python 2.7 and a C++ compiler. Create a directory for your bot, and change to that directory in your command line. If you want to be more updated (at the expense of stability), you can install the beta builds instead. Eris supports a few optional libraries that could potentially improve bot performance but may require additional dependencies.

Downloads: 0 This Week

Last Update: 2024-09-22
15

Whisper-WebUI

A web UI for easy subtitle generation using the Whisper model

Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.

Downloads: 1 This Week

Last Update: 2 days ago
16

Languine

Translate your application with Languine CLI powered by AI.

Languine is an AI-powered localization platform designed to automate and streamline the translation process for applications, ensuring seamless integration within development workflows. It offers intelligent, context-aware translations across over 100 languages, maintaining brand voice and tone consistency. It provides a command-line interface and continuous integration/continuous deployment integration, allowing developers to manage translations directly or automate them within existing pipelines. Languine supports various file formats, including JSON, YAML, Markdown, and more, catering to diverse project requirements. ...

Downloads: 0 This Week

Last Update: 2025-03-13
17

Open Interpreter

A natural language interface for computers

Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, etc.), so you can ask your computer in plain language to do tasks such as data analysis, file manipulation, and browsing (“chat with your computer”), with safeguards. It runs locally or via configured remote LLM servers/inference backends, giving flexibility to use models you trust or have...

Downloads: 15 This Week

Last Update: 2025-09-12
18

Moltis

A Rust-native claw you can trust

Moltis is an open-source personal AI assistant platform written in Rust that is designed to run as a fully self-hosted, local-first agent environment. It compiles the entire assistant stack, including the web interface, model routing, memory, and tools, into a single self-contained binary with no external runtime dependencies. The system supports multiple large language model providers alongside local models, enabling users to maintain privacy while still accessing cloud capabilities when...

Downloads: 3 This Week

Last Update: 2026-03-09
19

BlackBelt WASTE - ipv4/Tor/i2p +AI+Voice

Modern, AI-Smart, WASTE p2p for ipv4, Tor and i2p + Voice Conference.

Open source (GPLv3, including images). A WASTE client. Download and create your own WASTE networks. Move thousands of GB at 100+ MB per second (800 Mbit/s). Full pause and resume support. Voice conference, chat, file transfer, and forum participation in a secure environment. For Windows XP 32/64, Vista 32/64, Win7 32/64, Win8 32/64, Win 10, Win 11, Linux (WINE). *** User Based Access Control - for voice, chats, file transfers and uploads. (useful in NULLNETS) *** Distributed...

1 Review

Downloads: 5 This Week

Last Update: 2026-03-01
20

Scribe

Free, open-source, and offline speech-to-text & voice control app.

Scribe is a free and open-source desktop assistant that brings powerful speech-to-text and voice control capabilities directly to your PC. It allows you to dictate text into any application, create custom voice commands, launch programs, and automate your workflow with text replacements. Designed with privacy as a top priority, Scribe works completely offline. Your voice data never leaves your computer. Powered by the Vosk engine, it supports multiple languages and provides...

Downloads: 77 This Week

Last Update: 2025-12-13
21

VoiceClip

VoiceClip is a user assistance application

VoiceClip is a user assistance application designed to integrate seamlessly into your work environment, providing quick and efficient access to a range of features through voice and text commands. Presented as a toolbar that always stays visible in the foreground, VoiceClip aims to simplify common tasks, improve productivity, and make it easier to interact with your operating system and with advanced artificial intelligence technologies.

1 Review

Downloads: 0 This Week

Last Update: 2025-04-30
22

UV_Assistant

Fast and secure voice assistant for your Windows PC!

UV Assistant can open applications and websites using voice commands, among other things.

1 Review

Downloads: 0 This Week

Last Update: 2024-01-27
23

KoboldCpp

Run GGUF models easily with a UI or API. One File. Zero Install.

KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds off llama.cpp and adds many additional powerful features.

Downloads: 252 This Week

Last Update: 1 day ago
24

eGuideDog free software for the blind

eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.

16 Reviews

Downloads: 200 This Week

Last Update: 2 days ago
25

Besgnulinux

Based on Debian Stable. Installation tool: Calamares. Besgnulinux is for both new and low-powered machines and for end users (newbies). Besgnulinux tries to be fast, lightweight, easy to use, and stable. It is designed to meet every need with window manager sessions like JWM and Openbox. Instead of background elements, the system is under user control. It does the same things that resource-heavy desktops like KDE and GNOME do, but with very few resources. Besgnulinux's goal is to...

3 Reviews

Downloads: 436 This Week

Last Update: 2026-03-05