Tofu is a Python library for generating synthetic UK Biobank data. The UK Biobank is a large open-access prospective research cohort study of 500,000 middle-aged participants recruited in England, Scotland and Wales. The study has collected, and continues to collect, extensive phenotypic and genotypic detail about its participants, including data from questionnaires, physical measures, sample assays, accelerometry, multimodal imaging, genome-wide genotyping and longitudinal follow-up for a wide range of health-related outcomes. Tofu generates synthetic data that conforms to the structure of the baseline data UK Biobank sends to researchers, filling each field with random values. For categorical variables (single or multiple choice), a random value is picked from the UK Biobank data dictionary for that field. For continuous variables, a random value is generated based on the distribution of values reported for that field on the UK Biobank showcase.

Features

  • For categorical variables (single or multiple choices), a random value will be picked from the UK Biobank data dictionary for that field
  • For continuous variables, a random value will be generated based on the distribution of values reported for that field on the UK Biobank showcase
  • For date and date/time fields, a random date will be generated
  • For all other fields, such as polymorphic fields, no data will be generated
  • The lookups directory contains lookups downloaded from the UK Biobank showcase
  • Data conform to the structure and schema of the baseline file but are otherwise nonsensical: no checks have been implemented across fields
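
The per-field rules listed above can be illustrated with a short sketch. This is not Tofu's own code or API: the field names, codings, date range and distribution parameters are hypothetical, and drawing continuous values from a normal distribution is an assumption made only for illustration.

```python
import random
from datetime import date, timedelta

# Hypothetical field specifications. The names, codes, and parameters are
# invented for illustration and are not taken from the UK Biobank showcase.
FIELDS = {
    "sex": {"type": "categorical_single", "codes": [0, 1]},
    "qualifications": {
        "type": "categorical_multiple",
        "codes": [1, 2, 3, 4, 5, 6],
        "max_choices": 3,
    },
    # Assumption: a normal distribution described by a mean and a standard deviation.
    "height_cm": {"type": "continuous", "mean": 168.0, "sd": 9.0},
    "assessment_date": {
        "type": "date",
        "start": date(2006, 1, 1),
        "end": date(2010, 12, 31),
    },
    # Any other field type (for example polymorphic) gets no data.
    "notes": {"type": "polymorphic"},
}


def synth_value(spec, rng):
    """Return one random value for a field, following the rules in the list above."""
    kind = spec["type"]
    if kind == "categorical_single":
        # Pick one code from the field's coding.
        return rng.choice(spec["codes"])
    if kind == "categorical_multiple":
        # Pick between 1 and max_choices distinct codes.
        k = rng.randint(1, spec["max_choices"])
        return rng.sample(spec["codes"], k)
    if kind == "continuous":
        return rng.gauss(spec["mean"], spec["sd"])
    if kind == "date":
        # Pick a uniformly random day between start and end, inclusive.
        span_days = (spec["end"] - spec["start"]).days
        return spec["start"] + timedelta(days=rng.randint(0, span_days))
    # All other field types: no data generated.
    return None


def synth_rows(n, fields=FIELDS, seed=None):
    """Generate n synthetic participants as a list of dicts keyed by field name."""
    rng = random.Random(seed)
    return [
        {name: synth_value(spec, rng) for name, spec in fields.items()}
        for _ in range(n)
    ]
```

Calling `synth_rows(5, seed=1)` returns five dictionaries with one entry per field. Each field is drawn independently, so, as noted above, the rows match the schema but carry no consistency across fields.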

Additional Project Details

Programming Language

Python

Related Categories

Python Synthetic Data Generation Software

Registered

2023-05-22