Tutorials
The tutorial repository provides walkthroughs for exploring, evaluating, and validating synthetic data. To run the tutorials locally, clone the repository and open the notebooks in JupyterLab; alternatively, open them in Google Colab to run in a managed cloud environment. Each tutorial demonstrates a distinct capability of the synthetic data platform.
Tutorial | Colab Link | GitHub Link |
---|---|---|
Getting started with the SDK | View Notebook | |
Validate synthetic data via Train-Synthetic-Test-Real | View Notebook | |
Explore the size vs. accuracy trade-off for synthetic data | View Notebook | |
Differentially private synthetic data | View Notebook | |
Rebalance synthetic datasets for data augmentation | View Notebook | |
Conditionally generate synthetic (geo) data | View Notebook | |
Explain AI with synthetic data | View Notebook | |
Generate fair synthetic data | View Notebook | |
Generate synthetic text via a fast LSTM model trained from scratch | View Notebook | |
Generate synthetic text via a pre-trained Large Language Model | View Notebook | |
Perform multi-table synthesis | View Notebook | |
Analyse star-schema correlations | View Notebook | |
Develop a fake or real discriminator with Synthetic Data | View Notebook | |
Close gaps in your data with Smart Imputation | View Notebook | |
Calculate accuracy and privacy metrics for Quality Assurance | View Notebook | |
Enrich Sensitive Data with LLMs using Synthetic Replicas | View Notebook | |
MOSTLY AI vs. SDV comparison: single-table scenario | View Notebook | |
MOSTLY AI vs. SDV comparison: sequential scenario | View Notebook | |