A minimal implementation of diffusion models of text: it learns a diffusion model of a given text corpus and generates text samples from the learned model. The goal was to retain just enough code to train a simple diffusion model and generate samples, remove image-related terms, and make the code easier to use.

To train a model, run scripts/train.sh. By default, this trains a model on the simple corpus, but you can point it at any text file with the --train_data argument. Note that you may have to increase the sequence length (--seq_len) if the sequences in your corpus are longer than those in the simple corpus. The other default arguments match the best settings I found for the simple corpus.
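One rough way to choose a value for --seq_len before training on your own text file is to measure the longest sequence in it. The sketch below is not part of the project. It assumes the corpus holds one sequence per line and that tokens are separated by whitespace, which may not match the tokenizer the training script actually uses. The helper names (longest_sequence, suggest_seq_len) and the headroom of 2 extra positions are hypothetical choices for illustration.

```python
import sys


def longest_sequence(corpus_path):
    """Return the largest whitespace-token count found on any line.

    Assumption: one training sequence per line, whitespace tokenization.
    """
    longest = 0
    with open(corpus_path, encoding="utf-8") as handle:
        for line in handle:
            # str.split() with no arguments ignores blank lines and
            # repeated spaces, so empty lines count as zero tokens.
            longest = max(longest, len(line.split()))
    return longest


def suggest_seq_len(corpus_path, headroom=2):
    """Longest line plus some headroom (hypothetical allowance for
    special tokens such as start/end markers)."""
    return longest_sequence(corpus_path) + headroom


if __name__ == "__main__":
    # Usage: python this_script.py path/to/corpus.txt
    if len(sys.argv) != 2:
        sys.exit("expected exactly one argument: the corpus file path")
    print("suggested --seq_len:", suggest_seq_len(sys.argv[1]))
```

If the suggested value is larger than the default used for the simple corpus, that is a hint to raise --seq_len. Treat the number as a starting point only, since a subword tokenizer would produce more tokens per line than a whitespace split.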

Features

  • Training from scratch on the greetings dataset
  • Experiments with using pre-trained models and embeddings
  • Controllable generation
  • A minimal implementation of diffusion models of text
  • Generate text samples from the learned model
  • Opportunities for further minimization

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python AI Text Generators, Python Generative AI

Registered

2023-03-23