The Solution: Noise-Aware Multiple Imputation (NA+MI)
Rubin's Rules
The Bayesian Model - Variables
Results - Toy Example
Results - UCI Adult Dataset
Results - Marginal Accuracy on Adult
Limitations
Conclusion
Description:
Explore the challenges and solutions in analyzing differentially private synthetic data in this 27-minute conference talk by Ossi Räisä from the Finnish Center for Artificial Intelligence. Delve into the innovative Noise-Aware Multiple Imputation (NA+MI) pipeline, which combines synthetic data analysis techniques from multiple imputation with noise-aware Bayesian modeling. Discover how this approach addresses the issue of invalid inferences when analyzing differentially private synthetic data as if it were real. Learn about the novel NAPSU-MQ algorithm for discrete data generation using marginal queries, based on the principle of maximum entropy. Examine experimental results demonstrating the pipeline's ability to produce accurate confidence intervals from differentially private synthetic data, accounting for additional uncertainty from privacy noise. Gain insights into the limitations of this approach and its potential implications for privacy-preserving machine learning.
Noise-Aware Statistical Inference with Differentially Private Synthetic Data