USENIX Security '23 - Calpric: Inclusive and Fine-grain Labeling of Privacy Policies with...
Description:
Explore a conference talk from USENIX Security '23 that introduces Calpric, an approach to labeling privacy policies using crowdsourcing and active learning. Learn how the method combines automatic text selection and segmentation with crowdsourced annotators to generate a large, balanced training set for privacy policies at reduced cost. Discover how Calpric supports training more accurate deep learning models that cover a wider range of data categories and provide more detailed, fine-grain labels than previous work. Understand the cost-effectiveness of the approach, which obtains reliably labeled data at approximately $0.92-$1.71 per text segment and produces a labeled dataset of 16K privacy policy text segments across 9 data categories, with balanced positive and negative samples.
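To give a rough sense of how active learning can cut labeling cost, here is a minimal sketch of uncertainty sampling, a common generic selection strategy. It is not taken from the talk: the description does not say how Calpric selects text, so the function name `select_for_annotation`, the example segments, and the scores are all hypothetical. The idea is to send annotators only the segments the current model is least sure about.

```python
from typing import List, Tuple


def select_for_annotation(
    scored_segments: List[Tuple[str, float]], budget: int
) -> List[str]:
    """Return the `budget` segments whose predicted probability is closest to 0.5.

    Generic uncertainty sampling; an assumption, not Calpric's documented method.
    """
    # A score near 0.5 means the classifier cannot tell positive from negative.
    ranked = sorted(scored_segments, key=lambda item: abs(item[1] - 0.5))
    return [text for text, _ in ranked[:budget]]


# Hypothetical segments with made-up classifier scores for one data category.
pool = [
    ("We collect your precise location.", 0.97),
    ("Your information may be used to improve services.", 0.55),
    ("We do not sell contact details.", 0.42),
    ("This policy was last updated in March.", 0.03),
]

# Only the two most ambiguous segments would go to crowdsourced annotators.
to_label = select_for_annotation(pool, budget=2)
print(to_label)
```

In a loop of this kind, the newly labeled segments would be added to the training set, the model retrained, and the pool rescored before the next round.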
Calpric - Inclusive and Fine-grain Labeling of Privacy Policies with Crowdsourcing and Active Learning