The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You also customize the process to include your own work. Select any of the publicly available datasets from the SDV project, or input your own data. Choose from any of the SDV synthesizers and baselines. Or write your own custom machine learning model. In addition to performance and memory usage, you can also measure synthetic data quality and privacy through a variety of metrics. Install SDGym using pip or conda. We recommend using a virtual environment to avoid conflicts with other software on your device.

Features

  • Benchmark synthetic data generation for single tables
  • Supply a custom synthesizer
  • Benchmark your own synthetic data generation techniques
  • Customize your datasets
  • The SDGym library includes many publicly available datasets that you can include right away
  • You can also include any custom, private datasets that are stored on your computer on an Amazon S3 bucket

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow SDGym

SDGym Web Site

Other Useful Business Software
Ship AI Apps Faster with Vertex AI Icon
Ship AI Apps Faster with Vertex AI

Go from idea to deployed AI app without managing infrastructure. Vertex AI offers one platform for the entire AI development lifecycle.

Ship AI apps and features faster with Vertex AI—your end-to-end AI platform. Access Gemini 3 and 200+ foundation models, fine-tune for your needs, and deploy with enterprise-grade MLOps. Build chatbots, agents, or custom models. New customers get $300 in free credit.
Try Vertex AI Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of SDGym!