Intro

Why Care About Low-Resource Speech Processing?

How Much Transcribed Audio Do We Need?

Why Do We Need All That Training Data?

Multilingual Features

The IARPA Babel Program

Babel Languages

Limited resources

What is keyword search, and why focus on it?

How do we measure keyword search performance?

Properties of term-weighted value

Take-Home Messages

Three Ways of Looking at Speech

Deep Neural Network

A Stacked DNN Architecture

Convolutional Neural Network

Considered 2 CNN Architectures

Recurrent Neural Network

Bidirectional LSTM Architecture

Three Use Cases

More Expressive Architectures Make a Big Difference

Fixed Features Allow for Rapid Development

Our partners

Babel resources

Description:

In this 40-minute talk, Brian Kingsbury of IBM explores multilingual speech representations for low-resource speech processing: how to achieve good automatic speech recognition performance with limited transcribed data, a constraint that applies to thousands of languages worldwide. The talk covers the IARPA Babel Program, keyword search and how its performance is measured with term-weighted value, and three neural network architectures: deep neural networks, convolutional neural networks, and recurrent neural networks (including a bidirectional LSTM). It explains why multilingual features reduce the amount of training data needed to build speech recognition systems for new languages, examines three use cases, and shows that more expressive architectures make a big difference in performance. Aimed at researchers and professionals working on speech processing for low-resource languages.

Multilingual Representations for Low-Resource Speech Processing

MITCBMM

#Computer Science #Artificial Intelligence #Speech Recognition #Machine Learning #Neural Networks #Neural Network Architecture #Deep Learning #Deep Neural Networks