CogVideo is an open-source text-, image-, and video-to-video generation project that hosts the CogVideoX family of diffusion-transformer models and end-to-end tooling. The repo includes SAT and Diffusers implementations, turnkey demos, and fine-tuning pipelines (including LoRA) designed to run across a wide range of NVIDIA GPUs, from desktop cards (e.g., RTX 3060) to data-center hardware (A100/H100). Current releases cover CogVideoX-2B, CogVideoX-5B, and the upgraded CogVideoX1.5-5B variants, plus image-to-video (I2V) models, with BF16/FP16/FP32 precision options and INT8 quantized inference via TorchAO for memory-constrained setups. The codebase emphasizes practical deployment: prompt-optimization utilities (LLM-assisted long-prompt expansion), Colab notebooks, a Gradio web app, and multiple performance knobs (tiling/slicing, CPU offload, torch.compile, multi-GPU, and FA3 backends via partner projects).
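The Diffusers path can be exercised in a few lines. Below is a minimal text-to-video sketch: the model IDs and sampling settings follow the publicly documented CogVideoX examples, and the `RECOMMENDED_DTYPE` mapping and `generate` helper are illustrative conveniences, not part of the repo's API.

```python
# Recommended inference dtypes per the CogVideoX model cards
# (2B: FP16; 5B and 1.5-5B: BF16). Stored as strings so the
# mapping can be inspected without torch or a GPU present.
RECOMMENDED_DTYPE = {
    "THUDM/CogVideoX-2b": "float16",
    "THUDM/CogVideoX-5b": "bfloat16",
    "THUDM/CogVideoX1.5-5B": "bfloat16",
}


def generate(model_id: str, prompt: str, out_path: str = "output.mp4"):
    """Run one text-to-video generation (downloads weights on first use)."""
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    pipe = CogVideoXPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, RECOMMENDED_DTYPE[model_id])
    )
    pipe.enable_model_cpu_offload()  # lowers peak VRAM at some speed cost

    video = pipe(
        prompt=prompt,
        num_inference_steps=50,
        guidance_scale=6.0,
        num_frames=49,
    ).frames[0]
    export_to_video(video, out_path, fps=8)


if __name__ == "__main__":
    generate("THUDM/CogVideoX-2b", "A panda playing guitar in a bamboo forest")
```

For the 1.5 variants, frame counts and resolutions differ from the 49-frame default shown here; check the corresponding model card before reusing these settings.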
Features
- Multiple tasks: text-to-video, image-to-video, and video-to-video generation.
- Dual stacks: SAT implementations and Diffusers pipelines with shared demos.
- Fine-tuning recipes (including LoRA), plus cogvideox-factory for single-GPU (e.g., RTX 4090) training.
- Quantized inference (INT8 via TorchAO) and memory optimizations (CPU offload, tiling, slicing).
- Ready-to-run assets: Colab notebooks, CLI demos, and a Gradio web UI with super-resolution and frame-interpolation tools.
- Utilities & ecosystem: weight converters (SAT→HF), captioning tools, and third-party integrations (ComfyUI, ControlNet, xDiT, VideoSys).
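The memory optimizations in the list above (CPU offload, VAE tiling, VAE slicing) map onto a handful of standard Diffusers calls. A sketch of wiring them together, assuming `pipe` is an already-loaded CogVideoX pipeline; the `apply_memory_optimizations` helper is a hypothetical name, not something the repo exports:

```python
def apply_memory_optimizations(pipe, offload: bool = True):
    """Apply low-VRAM knobs to a loaded CogVideoX Diffusers pipeline."""
    if offload:
        # Streams submodules to GPU one at a time: lowest VRAM, slowest.
        pipe.enable_sequential_cpu_offload()
    # Decode the video latents in spatial tiles rather than all at once.
    pipe.vae.enable_tiling()
    # Decode frames in slices to cap peak activation memory.
    pipe.vae.enable_slicing()
    return pipe
```

With all three enabled, generation is slower but fits on much smaller cards; INT8 weight quantization via TorchAO can be layered on top for further savings.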