1. Introduction
2. Naive Bayes: A Little History
3. Naive Bayes: Advantages and Disadvantages
4. About the Dataset: YouTube Spam Collection
5. Prerequisites
6. Naive Bayes: An Example
7. Naive Bayes: The Equation
8. Loading the Dataset
9. Train/Test Split
10. Feature Extraction: Bag-of-Words Approach
11. Bag-of-Words Approach: Training
12. Bag-of-Words Approach: Testing and Evaluation
13. Feature Extraction: TF-IDF Approach
14. TF-IDF Approach: Training
15. TF-IDF Approach: Testing and Evaluation
16. Tuning Parameters: Laplace Smoothing
Description:
Explore a 30-minute EuroPython Conference talk on building a Naive Bayes text classifier using scikit-learn. Learn about the algorithm's simplicity and effectiveness in classifying large, sparse datasets like text documents. Discover preprocessing techniques such as text normalization and feature extraction. Follow along as the speaker demonstrates model construction using the spam/ham YouTube comment dataset from the UCI repository. Gain insights into the Naive Bayes algorithm's history, advantages, and disadvantages. Dive into practical examples, equations, and implementation steps, including dataset loading, train/test splitting, and feature extraction using bag-of-words and TF-IDF approaches. Conclude with techniques for model evaluation and parameter tuning through Laplace smoothing.
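The pipeline the talk walks through — train/test split, bag-of-words and TF-IDF feature extraction, multinomial Naive Bayes training, evaluation, and Laplace smoothing via the `alpha` parameter — can be sketched with scikit-learn roughly as below. This is a minimal illustration, not the speaker's code: the tiny inline comment list stands in for the UCI YouTube Spam Collection CSVs, and the specific labels and split sizes are assumptions.

```python
# Hypothetical stand-in data replacing the UCI YouTube Spam Collection
# (in the real dataset, the CONTENT column holds the comment text and
# the CLASS column holds the label: 1 = spam, 0 = ham).
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

comments = [
    "Check out my channel for free gift cards!!!",
    "Subscribe here and win an iPhone now",
    "This song is amazing, brings back memories",
    "I love this video, the beat is great",
    "Free followers, click the link below",
    "Who else is still listening to this?",
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = spam, 0 = ham

# Train/test split (proportion and seed are arbitrary choices here).
X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.33, random_state=42, stratify=labels)

# Bag-of-words features + multinomial Naive Bayes.
# alpha is the smoothing parameter; alpha=1.0 is classic Laplace smoothing,
# which avoids zero probabilities for words unseen in a class.
bow = CountVectorizer()
nb_bow = MultinomialNB(alpha=1.0)
nb_bow.fit(bow.fit_transform(X_train), y_train)
bow_acc = nb_bow.score(bow.transform(X_test), y_test)

# Same classifier over TF-IDF features instead of raw counts.
tfidf = TfidfVectorizer()
nb_tfidf = MultinomialNB(alpha=1.0)
nb_tfidf.fit(tfidf.fit_transform(X_train), y_train)
tfidf_acc = nb_tfidf.score(tfidf.transform(X_test), y_test)

print("bag-of-words accuracy:", bow_acc)
print("TF-IDF accuracy:", tfidf_acc)
```

Tuning would then mean re-fitting with different `alpha` values (e.g. via a grid search) and comparing held-out accuracy, as the final chapter of the talk does.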

Building a Naive Bayes Text Classifier with scikit-learn

EuroPython Conference