1 - Intro & Overview
2 - Recap: The T5 model
3 - The ExT5 model and task formulations
4 - ExMix dataset
5 - Do different tasks help each other?
6 - Which tasks should we include?
7 - Pre-Training vs Pre-Finetuning
8 - A few hypotheses about what's going on
9 - How much self-supervised data to use?
10 - More experimental results
11 - Conclusion & Summary
Description:
Explore an in-depth analysis of the ExT5 model, which pushes the limits of T5 by pre-training on 107 different supervised NLP tasks using the ExMix dataset. Learn about the model's architecture, task formulations, and performance compared to T5 baselines. Discover insights into multi-task scaling, co-training transfer among task families, and the impact of self-supervised data in pre-training. Gain an understanding of ExT5's improved performance on various NLP tasks and its enhanced sample efficiency during pre-training. This comprehensive video covers topics such as task selection, pre-training vs. pre-finetuning, and experimental results, providing valuable insights for researchers and practitioners in the field of natural language processing and transfer learning.
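
Below is a minimal, hypothetical Python sketch of the core idea discussed in the video: casting heterogeneous supervised tasks into T5's text-to-text format and mixing them with a C4-style self-supervised span-corruption objective during pre-training. The task prefixes, example data, and the `self_supervised_ratio` knob are illustrative assumptions, not the paper's exact preprocessing or mixing recipe.

```python
import random

# Hypothetical supervised examples cast into the text-to-text format,
# in the spirit of T5/ExT5: every task becomes (input text -> target text).
SUPERVISED_EXAMPLES = {
    "nli": [
        ("nli premise: A man is playing guitar. hypothesis: A person makes music.",
         "entailment"),
    ],
    "summarization": [
        ("summarize: The ExT5 paper studies multi-task pre-training on 107 supervised tasks.",
         "ExT5 scales multi-task pre-training."),
    ],
    "qa": [
        ("question: Who proposed T5? context: T5 was introduced by Raffel et al.",
         "Raffel et al."),
    ],
}


def span_corruption_example(text, corrupt_prob=0.15, seed=0):
    """Toy self-supervised example: drop random tokens and ask the model to
    reconstruct them behind sentinel tokens (a rough sketch of C4-style span
    corruption, not the exact T5 preprocessing)."""
    rng = random.Random(seed)
    inp, tgt, sentinel = [], [], 0
    for tok in text.split():
        if rng.random() < corrupt_prob:
            inp.append(f"<extra_id_{sentinel}>")
            tgt.append(f"<extra_id_{sentinel}> {tok}")
            sentinel += 1
        else:
            inp.append(tok)
    return " ".join(inp), " ".join(tgt) if tgt else "<no_corruption>"


def sample_pretraining_batch(batch_size=4, self_supervised_ratio=0.5, seed=0):
    """Sample a mixed batch: a fraction comes from span corruption, the rest
    is drawn uniformly from the supervised task pool (ExMix-like). The ratio
    is a knob the paper ablates; 0.5 here is purely illustrative."""
    rng = random.Random(seed)
    c4_like_text = "Transfer learning with text to text transformers scales well"
    batch = []
    for _ in range(batch_size):
        if rng.random() < self_supervised_ratio:
            batch.append(span_corruption_example(c4_like_text,
                                                 seed=rng.randint(0, 10**6)))
        else:
            task = rng.choice(list(SUPERVISED_EXAMPLES))
            batch.append(rng.choice(SUPERVISED_EXAMPLES[task]))
    return batch


if __name__ == "__main__":
    for inp, tgt in sample_pretraining_batch():
        print(f"INPUT : {inp}\nTARGET: {tgt}\n")
```

Because every task is reduced to the same (input text, target text) interface, adding more supervised tasks to the pre-training mixture only changes the sampling pool, which is what makes scaling to 107 tasks tractable in the first place.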

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

Yannic Kilcher