1. Intro
2. Why is text-to-audio hard?
3. Comparison with VQ-GAN
4. Comparison with SoundStream
5. AudioGen overview
6. Deep dive: audio representation, LSTM
7. Losses explained
8. Complex-valued STFTs
9. Audio Language Modeling
10. Multi-stream audio inputs
11. Data and augmentations
12. Results
13. Outro
Description:
This video explains the paper "AudioGen: Textually Guided Audio Generation". It covers why text-to-audio generation is hard, compares AudioGen with VQ-GAN and SoundStream, and walks through the audio representation, the LSTM component, the losses, and complex-valued STFTs. It then turns to audio language modeling, multi-stream audio inputs, data and augmentations, and the paper's results.

AudioGen: Textually Guided Audio Generation - Paper Explained

Aleksa Gordić - The AI Epiphany