1. Introduction
2. About me
3. Outline
4. Why multilingual data
5. Tasks associated with language systems
6. Syntax mixing
7. Transliterated text
8. Language identification
9. Language identification in practice
10. Other examples
11. Lambda ID
12. Blanked
13. Python
14. Limitations
15. Data augmentation
16. Simple example
17. The Transformer
18. Multi-head attention
19. State of the art (SOTA)
20. Why is it special
21. WordPiece Processing
22. Statistics of Languages
23. BERT Masked Language Model
24. Prediction Function
25. Code-Switched Example
26. Lyrics Example
27. Task Evaluation
28. Generation Evaluation
29. Summary
Description:
Explore the challenges of building multilingual Natural Language Processing (NLP) models, and some solutions, in this 45-minute PyCon US talk by Shreya Khurana. The talk covers language identification, transliterated and code-switched text, and multilingual BERT models. Learn about existing Python frameworks for language identification and their limitations. See how web crawlers and self-generated datasets can compensate for the lack of annotated data for transliterated and code-switched text. Examine, through practical examples, the performance of Google's multilingual BERT model, which was trained on 104 languages. Gain insights into evaluating NLP models on various tasks in a multilingual context. Additional resources and code examples are available on GitHub.
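To make the transliteration problem concrete, here is a minimal sketch that is not taken from the talk. It assumes a deliberately naive identifier that looks only at the Unicode script of each letter; the function name `dominant_script` and the sample phrases are illustrative. It shows why script alone cannot separate English from Hindi written in Latin characters.

```python
import unicodedata
from collections import Counter


def dominant_script(text: str) -> str:
    """Return a rough script label for the most common kind of letter in text.

    Hypothetical helper for illustration only; real language identifiers
    use character n-gram statistics rather than script alone.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            # The first word of a character's Unicode name is a rough script
            # label, e.g. "LATIN SMALL LETTER A" or "DEVANAGARI LETTER NA".
            counts[unicodedata.name(ch, "UNKNOWN").split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


native = "नमस्ते दुनिया"          # Hindi written in Devanagari
transliterated = "namaste duniya"  # the same phrase written in Latin script
english = "hello world"

for sample in (native, transliterated, english):
    print(sample, "->", dominant_script(sample))
```

The transliterated Hindi phrase and the English phrase receive the same script label, so a script-based check cannot tell them apart. Distinguishing them requires word-level or character n-gram evidence.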

How Multilingual Is Your NLP Model?

PyCon US