NÜWA - Pytorch download | SourceForge.net

Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch. It also contains an extension to joint video and audio generation, using a dual-decoder approach. Diffusion-based methods appear to have since taken over as the state of the art. However, I will continue with NÜWA, extending it to use multi-headed codes and a hierarchical causal transformer; I think that direction is still untapped for improving on this line of work. The paper also presents a way to condition video generation on segmentation mask(s). You can do this as well, provided you train a VQGanVAE on the sketches beforehand. You then use NUWASketch instead of NUWA, since it accepts the sketch VAE as a reference. This repository will also offer a variant of NUWA that can produce both video and audio. For now, the audio needs to be encoded manually.

Features

Train the VAE
Conditioning on Sketches
Text to video and audio
The audio will need to be encoded manually
This library will offer some utilities to make training easier
This library depends on an external vector quantization library
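
To make the vector quantization dependency concrete, here is a minimal sketch of the idea behind a vector-quantized VAE such as VQGanVAE: each continuous feature vector is replaced by the index of its nearest entry in a codebook, and the transformer then models those discrete indices. This is a toy illustration in plain Python, not the API of this library or of the vector quantization library it depends on; the function names, the codebook, and the feature values are all hypothetical.

```python
# Toy nearest-neighbour vector quantizer (illustrative only).
# All names here are hypothetical and are NOT part of nuwa-pytorch
# or of the vector quantization library it depends on.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(vectors, codebook):
    """Map each vector to the index of its closest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(v, codebook[i]))
        for v in vectors
    ]

def dequantize(indices, codebook):
    """Look the discrete codes back up in the codebook."""
    return [codebook[i] for i in indices]

# A made-up codebook with three 2-dimensional entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

# Made-up encoder outputs, one vector per spatial position.
features = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.9]]

codes = quantize(features, codebook)            # discrete tokens
reconstructed = dequantize(codes, codebook)     # input to a decoder

print(codes)
```

In a real model the codebook is learned and the lookup runs on batched tensors, but the discrete codes play the same role: they are the tokens the attention network is trained to predict.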

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Video Generators, Python Generative AI

Registered

2023-03-22
