Explore techniques for protecting sensitive data in large datasets using cloud tools in this 46-minute conference talk. Learn to identify personally identifiable information (PII) in massive datasets, understand concepts such as k-anonymity and l-diversity, and compare practical options for data protection: removing, masking, and coarsening. See real-life demonstrations on massive datasets and newly available tools for PII detection. Topics include Cloud DLP, BigQuery, tokenization, encryption, and differential privacy. Learn best practices for sharing public datasets while balancing data utility against the privacy of individuals, and see how to automate anonymity measures.
Protecting Sensitive Data in Huge Datasets - Cloud Tools You Can Use
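
For readers new to the k-anonymity and l-diversity concepts named in the description, here is a minimal Python sketch of how coarsening affects both measures. It is not taken from the talk: the records, the field names (`age`, `zip`, `diagnosis`), and the helper functions `coarsen`, `k_anonymity`, and `l_diversity` are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy records; age and zip act as quasi-identifiers,
# diagnosis is the sensitive attribute.
records = [
    {"age": 34, "zip": "94103", "diagnosis": "flu"},
    {"age": 37, "zip": "94107", "diagnosis": "asthma"},
    {"age": 31, "zip": "94110", "diagnosis": "flu"},
    {"age": 52, "zip": "10001", "diagnosis": "diabetes"},
    {"age": 58, "zip": "10003", "diagnosis": "diabetes"},
]

def raw_key(record):
    """Quasi-identifiers exactly as stored."""
    return (record["age"], record["zip"])

def coarsen(record):
    """Coarsened quasi-identifiers: age bucketed by decade, ZIP cut to 3 digits."""
    decade = (record["age"] // 10) * 10
    return (f"{decade}-{decade + 9}", record["zip"][:3] + "**")

def k_anonymity(rows, key):
    """Size of the smallest group of rows sharing the same quasi-identifier key."""
    return min(Counter(key(r) for r in rows).values())

def l_diversity(rows, key, sensitive):
    """Smallest number of distinct sensitive values found within any group."""
    groups = {}
    for r in rows:
        groups.setdefault(key(r), set()).add(r[sensitive])
    return min(len(values) for values in groups.values())

print("k before coarsening:", k_anonymity(records, raw_key))
print("k after coarsening: ", k_anonymity(records, coarsen))
print("l after coarsening: ", l_diversity(records, coarsen, "diagnosis"))
```

In this toy data every raw record is unique, so coarsening is what raises k above 1. The second group still holds a single diagnosis, which shows why a dataset can be k-anonymous and still leak a sensitive value, the gap l-diversity is meant to measure.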