- Tutorial Starts

- Topic Modeling Intro

- Workshop Environment

- Content location at GitHub

- Dataset used in this workshop

- LDA Intro

- Topic Modeling Use Cases

- 6 Steps in this Workshop

- Step 1: Loading Data

- Step 2: Data Preparation

- Step 2.1: Removing Punctuation

- Step 2.2: Removing digits and words with digits

- Step 2.3: Lowercasing all content

- Step 3: EDA

- Step 3.1: Word Cloud

- Step 3.2: Document Term Matrix

- Step 4: Data Modeling

- Step 4.1: Stop words removal

- Step 4.2: Creating Bigram and Trigram

- Step 4.3: Lemmatization

- Step 4.4: Tokenization

- Step 5: LDA Topic Modeling

- Step 6: Topic Modeling Performance and analysis

- Step 6.1: Topic visualization

- Step 6.2: Coherence Score

- Saving notebook to GitHub

- Recap

Description:

A 42-minute workshop on topic modeling in Python using Gensim, spaCy, NLTK, and other libraries. The workshop processes a dataset of NIPS papers in six steps: data loading, data preparation, exploratory data analysis, data modeling (stop word removal, bigram and trigram creation, lemmatization, and tokenization), LDA topic modeling, and performance analysis. Data preparation covers removing punctuation, removing digits and words that contain digits, and lowercasing the text; exploratory analysis covers a word cloud and a document-term matrix. The final step visualizes the topics and computes a coherence score to assess model performance. The accompanying notebook is available on GitHub for hands-on practice, and each section of the tutorial has its own timestamp.
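The data preparation steps (2.1 to 2.3) and the document-term matrix (3.2) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the workshop's notebook code: the workshop itself relies on Gensim, spaCy, and NLTK, and the function names `clean_text` and `document_term_matrix`, as well as the two sample sentences, are hypothetical.

```python
import re
import string
from collections import Counter


def clean_text(text: str) -> str:
    """Sketch of Steps 2.1-2.3 (hypothetical helper, not from the notebook)."""
    # Step 2.1: remove ASCII punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Step 2.2: remove any whitespace-delimited word that contains a digit.
    text = re.sub(r"\S*\d\S*", " ", text)
    # Step 2.3: lowercase, then collapse runs of whitespace.
    return " ".join(text.lower().split())


def document_term_matrix(docs):
    """Sketch of Step 3.2: sorted vocabulary plus one count row per document."""
    counts = [Counter(clean_text(doc).split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[count[term] for term in vocab] for count in counts]
    return vocab, matrix


# Made-up sample sentences standing in for paper text.
docs = ["Neural networks, in 1987!", "Bayesian networks & 2nd-order methods."]
vocab, matrix = document_term_matrix(docs)
print(vocab)
print(matrix)
```

Because punctuation is stripped before the digit filter runs, a hyphenated token such as `2nd-order` becomes `2ndorder` and is then dropped as a whole; running the steps in a different order would keep `order`. The ordering here follows the chapter sequence listed above.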

Topic Modeling Workshop for Beginners in Python

Prodramp

#Computer Science #Machine Learning #Topic Modeling #Data Science #Data Analysis #Programming #Programming Languages #Python #Data Cleaning #Natural Language Toolkit (NLTK) #spaCy