Lightning-fast, on-device TTS, running natively via ONNX
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Multilingual Document Layout Parsing in a Single Vision-Language Model
End-to-end speech processing toolkit
Open source AI VTuber platform with voice chat and Live2D avatars
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Framework for building neural networks
The open-source data curation platform for LLMs
Web-based tool that converts GitHub repository contents
A Web UI for easy subtitling using the Whisper model
Jittor is a high-performance deep learning framework
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Bailing is a voice dialogue robot similar to GPT-4o
Stanford NLP Python library for many human languages
Matter AI is an open-source AI code reviewer agent
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
RAPIDS Machine Learning Library
Data loaders and abstractions for text and NLP
Self-hosted AI audio transcription
C++ image processing and machine learning library using SIMD
Open-source framework for conversational voice AI agents
Give your OpenClaw AI agent a WhatsApp number
Computer vision projects | Fun AI projects related to computer vision
Refer and Ground Anything Anywhere at Any Granularity