Introduction

The Problem

Probabilistic Data Structures

HyperLogLog

HyperLogLog Algorithm

HyperLogLog Example

Bloom Filter

Python Code

When to Use

Description:

Explore probabilistic data structures in Python for handling large-scale data efficiently in this PyCon US talk. Discover how to count distinct items from a data firehose and how to determine whether an item has been seen before, trading a small amount of accuracy for speed and lower resource use. Learn about HyperLogLog and the Bloom filter, how they work at a high level, and how to apply them in Python. Gain insight into scenarios where absolute accuracy is impractical and where these structures provide fast, scalable answers, such as counting social media likes or tracking user interactions on websites. Access the accompanying GitHub repository and slides for hands-on examples and further study.
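As a rough illustration of the "No" and "Maybe" answers a Bloom filter gives, here is a minimal sketch. It is not taken from the talk or its repository. The class name, the default sizes, and the salted SHA-256 hashing scheme are all assumptions chosen for readability rather than performance.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch; sizes and hash scheme are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the example simple (a real filter would pack bits).
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not seen"; True means "maybe seen".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
for user in ["alice", "bob", "carol"]:
    seen.add(user)
```

An item that was added always reports "maybe", while an item that was never added usually reports "no" but can occasionally collide and report "maybe". That false-positive rate is the accuracy traded away for a fixed, small memory footprint.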

No, Maybe and Close Enough - Using Probabilistic Data Structures in Python

PyCon US

#Conference Talks #PyCon US #Programming #Programming Languages #Python #Data Science #Data Processing #Computer Science #Data Structures #Bloom Filters #Probabilistic Data Structures