Intro

k-means clustering

The online setting

The Goal(s) (Cohen-Addad et al. 2021)

A troubling example

Lower Bound

A natural starting point: streaming

A difficult case for streaming

Idea 1: don't remove centers

Proof Sketch

We still have problems on pathological examples.

Idea 2: Using the scale to delete points.

The Lemma Revisited

Our Algorithm's Performance

Proof idea

Future Directions

Description:

Explore online k-means clustering for arbitrary data streams in this 37-minute USC Probability and Statistics Seminar talk by Robi Bhattacharjee. Delve into a novel approach for achieving a clustering loss comparable to the best possible loss using k fixed points in hindsight. Learn about a proposed data parameter, Λ(X), and its implications for algorithm performance. Discover a randomized algorithm that achieves O(Λ(X) + L(X, OPT_k)) clustering loss while using only O(k poly(log n)) memory and cluster centers. Understand how this algorithm achieves polynomial space and time complexity without making assumptions on the input data. Follow the presentation through key concepts including the online setting, lower bounds, streaming challenges, and ideas for center management and point deletion. Gain insights into future directions for research in this field.
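To make the two losses in the description concrete, here is a minimal Python sketch. It assumes squared Euclidean distance and an online protocol in which the learner commits to a set of centers before each point arrives; both are assumptions for illustration, as are the names `kmeans_loss` and `online_loss` and the toy data. The naive learner shown is not the algorithm from the talk.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_loss(points, centers):
    """Hindsight loss L(X, C): every point pays its squared distance
    to the nearest center in one fixed set C."""
    return sum(min(sq_dist(x, c) for c in centers) for x in points)


def online_loss(points, centers_per_step):
    """Online loss: point x_t pays its squared distance to the nearest
    center in C_t, the set chosen before x_t was revealed (assumed protocol)."""
    return sum(
        min(sq_dist(x, c) for c in centers)
        for x, centers in zip(points, centers_per_step)
    )


# Toy stream: two well-separated pairs of points.
X = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Best k = 2 fixed centers in hindsight: the midpoint of each pair.
hindsight = kmeans_loss(X, [(0.0, 1.0), (10.0, 1.0)])

# Naive online learner (hypothetical): starts with one arbitrary center
# and then adds every point it has seen so far as a new center.
start = (0.0, 0.0)
steps = [[start] + X[:t] for t in range(len(X))]
online = online_loss(X, steps)

print(hindsight, online)
```

On this stream the naive learner pays far more than the hindsight optimum, because it has no center near the second pair when its first point arrives. This is the kind of gap that an additive term such as Λ(X) in the stated bound is meant to account for.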

Online k-means Clustering on Arbitrary Data Streams - Lecture

USC Probability and Statistics Seminar

#Computer Science #Machine Learning #K-Means Clustering #Data Science #Data Analysis #Algorithms #Randomized Algorithms #Time Complexity #Space Complexity #Approximation Algorithms #Computer Graphics #Computational Geometry