1. Intro
2. Outline
3. Text-to-speech as sequence-to-sequence mapping
4. Speech production process
5. Typical flow of TTS system
6. Speech synthesis approaches
7. Probabilistic formulation of TTS
8. Approximation (2)
9. Representation - Linguistic features
10. Representation - Acoustic features
11. Representation - Mapping
12. HMM-based generative acoustic model for TTS
13. Alternative acoustic model
14. FFNN-based acoustic model for TTS [6]
15. NN-based generative acoustic model for TTS
16. NN-based generative model for TTS
17. Learned features
18. WaveNet: A generative model for raw audio
19. WaveNet - Causal dilated convolution
20. WaveNet - Architecture
21. Softmax
22. WaveNet vs conventional audio generative models
23. Relax approximation
24. Generative model-based text-to-speech synthesis
25. Beyond text-to-speech synthesis
26. Beyond generative TTS
Description:
This 38-minute conference talk by Heiga Zen of Google surveys generative model-based approaches to speech synthesis and the improvements they have brought to the naturalness of synthesized speech. It covers the probabilistic formulation of text-to-speech systems and several acoustic models, including HMM-based, FFNN-based, and NN-based generative models. It then presents the architecture of WaveNet, a generative model for raw audio, and its advantages over conventional audio generative models. The talk closes with possible future directions for text-to-speech synthesis and applications beyond it.

Generative Model-Based Text-to-Speech Synthesis

MITCBMM