1. Recording Start
2. Lecture starts
3. Course Materials Copyright
4. Announcements
5. Choosing k for minhashing motivation
6. PAC
7. Central Limit Theorem
8. Chernoff-Hoeffding Inequality
9. Choosing k for a good estimate of JS
10. Recording ends
Description:
This 21-minute data mining lecture covers how to choose the number of hash functions k for Min Hashing. It motivates the choice of k, then introduces the statistical foundations used to justify it: the Probably Approximately Correct (PAC) framework, the Central Limit Theorem, and the Chernoff-Hoeffding Inequality. The lecture closes by applying these results to pick a k that gives a good Min Hashing estimate of the Jaccard Similarity (JS), with a statistical guarantee on the estimate's accuracy.
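
As a rough illustration of how a Chernoff-Hoeffding bound turns into a choice of k, the sketch below uses the standard two-sided Hoeffding inequality for an average of k independent values in [0, 1]: the estimate is off by more than eps with probability at most 2·exp(-2·k·eps²). Setting that at most delta and solving for k gives k ≥ ln(2/delta) / (2·eps²). The function name `min_hashes_needed` and the parameters `eps` and `delta` are illustrative assumptions, and the exact constants used in the lecture may differ.

```python
import math


def min_hashes_needed(eps: float, delta: float) -> int:
    """Smallest k such that 2 * exp(-2 * k * eps**2) <= delta.

    Assumes each hash function contributes an independent 0/1 indicator
    (1 if the two sets share the same min-hash value), so the standard
    Hoeffding bound for averages of [0, 1] variables applies.
    """
    if not (0.0 < eps < 1.0):
        raise ValueError("eps must be in (0, 1)")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must be in (0, 1)")
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


if __name__ == "__main__":
    # Error at most 0.05 with probability at least 0.95.
    print(min_hashes_needed(eps=0.05, delta=0.05))
```

Note that under this bound k depends only on the error tolerance and the failure probability, not on the sizes of the sets being compared; halving eps roughly quadruples the number of hash functions needed.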

Data Mining: Choosing k for MinHashing - Spring 2023

UofU Data Science