Stable Video Diffusion Img2Vid XT is an image-to-video latent diffusion model from Stability AI that generates short video clips from a single static image. It produces 25 frames at 576x1024 resolution and was fine-tuned from the earlier 14-frame SVD model to improve temporal consistency. The model takes no text prompt; a single conditioning frame guides generation, making it well suited to stylized motion and animation. It includes both a standard frame-wise decoder and a fine-tuned f8-decoder that enhances coherence across frames. Output videos are short (under 4 seconds) and not always fully photorealistic: faces and realistic motion may be rendered inconsistently, and the model cannot generate legible writing. It is suited for creative video generation, research, and educational applications under a community license, with watermarking applied to output frames by default.
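As a minimal sketch of how generation from a single conditioning frame might look in practice, the snippet below assumes the model is loaded through the Hugging Face diffusers `StableVideoDiffusionPipeline`; exact arguments, the input path, and the seed are illustrative and may differ between diffusers versions.

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the XT checkpoint in half precision for GPU inference.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# The single conditioning frame; resized to the model's native 1024x576 (width x height).
image = load_image("input.jpg")  # hypothetical input path
image = image.resize((1024, 576))

generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]

# 25 frames exported at ~7 fps yields a clip of roughly 3.5 seconds.
export_to_video(frames, "generated.mp4", fps=7)
```

Lowering `decode_chunk_size` trades speed for reduced peak memory when decoding the latent frames.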
Features
- Converts a single image into a 25-frame video
- Fine-tuned from the SVD 14-frame model for smoother motion
- Outputs videos at 576x1024 resolution
- Includes both a standard frame-wise decoder and a fine-tuned f8-decoder for improved temporal coherence
- Supports latent diffusion for efficient generation
- Intended for artistic, educational, and research purposes
- Inference code includes watermarking via imWatermark (see the sketch after this list)
- Developed with safety filtering and red-team evaluations for responsible use
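The following is a hedged sketch of how per-frame watermarking with the imWatermark (invisible-watermark) package could be applied to generated frames; the payload string and helper function are illustrative assumptions, not the official inference code.

```python
import cv2
import numpy as np
from imwatermark import WatermarkEncoder

# Configure the encoder once; the payload below is a hypothetical example.
encoder = WatermarkEncoder()
encoder.set_watermark("bytes", b"StableVideoDiffusion")

def watermark_frames(frames):
    """Embed the watermark into each RGB frame (PIL images or uint8 arrays)."""
    marked = []
    for frame in frames:
        # imWatermark operates on BGR arrays, so convert from RGB and back.
        bgr = cv2.cvtColor(np.asarray(frame), cv2.COLOR_RGB2BGR)
        bgr = encoder.encode(bgr, "dwtDct")  # DWT+DCT embedding method
        marked.append(cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB))
    return marked
```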