Make-A-Video - Pytorch (wip)

Implementation of Make-A-Video, the new SOTA text-to-video generator from Meta AI, in PyTorch. The authors combine pseudo-3D convolutions (axial convolutions) with temporal attention and show much better temporal fusion. Pseudo-3D convolutions are not a new concept; they have been explored before in other contexts, for example in protein contact prediction as "dimensional hybrid residual networks".

The gist of the paper is this: take a SOTA text-to-image model (the authors use DALL-E 2, but the same lessons would apply just as well to Imagen), make a few minor modifications for attention across time and other ways to save on compute cost, do frame interpolation correctly, and get a great video model out.

When images are passed in (for example, when pretraining on images first), both temporal convolution and temporal attention are skipped automatically. In other words, you can use these modules directly in a 2D Unet and then port them over to a 3D Unet once that phase of training is done.
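
As a rough illustration of the factorised convolution described above, the following minimal sketch applies a 2D spatial convolution to every frame and then a 1D temporal convolution to every pixel location. It is not the code from this repository: the class name `PseudoConv3dSketch`, the `enable_time` argument, the `(batch, channels, frames, height, width)` layout, and the Dirac initialization of the temporal kernel are all assumptions made for the example.

```python
import torch
from torch import nn


class PseudoConv3dSketch(nn.Module):
    """Hypothetical factorised convolution: 2D over space, then 1D over time."""

    def __init__(self, dim, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.spatial = nn.Conv2d(dim, dim, kernel_size, padding=pad)
        self.temporal = nn.Conv1d(dim, dim, kernel_size, padding=pad)
        # Assumption: a Dirac kernel with zero bias makes the temporal
        # convolution an identity map at initialization.
        nn.init.dirac_(self.temporal.weight)
        nn.init.zeros_(self.temporal.bias)

    def forward(self, x, enable_time=True):
        is_video = x.ndim == 5  # images are (b, c, h, w), video is (b, c, f, h, w)
        if is_video:
            b, c, f, h, w = x.shape
            # Fold frames into the batch so the 2D conv sees one frame at a time.
            x = x.permute(0, 2, 1, 3, 4).reshape(b * f, c, h, w)

        x = self.spatial(x)

        if not is_video:
            return x  # images: the temporal convolution is skipped entirely

        x = x.reshape(b, f, c, h, w).permute(0, 2, 1, 3, 4)  # back to (b, c, f, h, w)
        if not enable_time:
            return x  # video, but time is deliberately ignored

        # Fold spatial positions into the batch so the 1D conv runs over frames.
        x = x.permute(0, 3, 4, 1, 2).reshape(b * h * w, c, f)
        x = self.temporal(x)
        return x.reshape(b, h, w, c, f).permute(0, 3, 4, 1, 2)


conv = PseudoConv3dSketch(dim=4)
images = torch.randn(2, 4, 8, 8)
video = torch.randn(2, 4, 3, 8, 8)
print(conv(images).shape, conv(video).shape)
```

Because the temporal kernel starts as an identity in this sketch, a freshly constructed module gives the same result for video whether or not `enable_time` is set, which is what lets spatial weights pretrained on images carry over unchanged.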

Features

The temporal modules are initialized to output identity, as in the paper
Both modules can be controlled so that, even when fed 3-dimensional (video) features, they train only spatially
Full SpaceTimeUnet that is agnostic to image or video training, and where time can be ignored even if video is passed in
When images are passed in (for example, when pretraining on images first), both temporal convolution and temporal attention are skipped automatically
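
The identity initialization and the image/video switching listed above can be sketched for the attention side as follows. This is a minimal example under stated assumptions, not the repository's implementation: `TemporalAttentionSketch` and `enable_time` are hypothetical names, the input layout is assumed to be `(batch, channels, frames, height, width)`, and identity at initialization is obtained here by zero-initializing the output projection of a residual block.

```python
import torch
from torch import nn


class TemporalAttentionSketch(nn.Module):
    """Hypothetical attention across frames, identity at initialization."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.to_out = nn.Linear(dim, dim)
        # Assumption: zeroing the output projection makes the residual
        # branch contribute nothing until training updates it.
        nn.init.zeros_(self.to_out.weight)
        nn.init.zeros_(self.to_out.bias)

    def forward(self, x, enable_time=True):
        # Images (4 dims), or video with time disabled, pass through untouched.
        if x.ndim == 4 or not enable_time:
            return x

        b, c, f, h, w = x.shape
        # One sequence of f frame tokens per spatial location.
        tokens = x.permute(0, 3, 4, 2, 1).reshape(b * h * w, f, c)
        normed = self.norm(tokens)
        attended, _ = self.attn(normed, normed, normed, need_weights=False)
        tokens = tokens + self.to_out(attended)  # residual connection
        return tokens.reshape(b, h, w, f, c).permute(0, 4, 3, 1, 2)


attn = TemporalAttentionSketch(dim=8)
video = torch.randn(1, 8, 3, 4, 4)
print(torch.allclose(attn(video), video))
```

With this design a model pretrained on images behaves exactly the same at the moment the temporal layers are switched on, and the temporal pathway only starts to matter once its output projection moves away from zero.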

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Video Generators, Python Deep Learning Frameworks, Python Generative AI

Registered

2023-03-22
