Intro

Outline

Agenda

BERT Paper

Architecture

Problems

Adapters

Modularity

Compositions

Overview

Function Composition

Input Composition

Parameter Composition

Fusion

Hypernetworks

Shared Hypernetworks

ChatGPT

Questions

Description:

Learn about multi-task learning in transformer-based NLP architectures in this 31-minute conference talk, which explores cost-effective alternatives to training a separate model for each task. Discover how leveraging information across multiple tasks and datasets can improve performance through shared models, representation bias, increased data efficiency, and eavesdropping. Explore solutions to challenges such as catastrophic forgetting and interference, along with general approaches to multi-task learning, adapter-based techniques, hypernetwork methods, and strategies for task sampling and balancing. The presentation covers the BERT paper, architecture considerations, modularity, function composition, input composition, parameter composition, fusion techniques, and shared hypernetworks, and concludes with remarks on ChatGPT.

Multi-Task Learning in Transformer-Based Architectures for Natural Language Processing

Data Science Conference

#Computer Science #Machine Learning #Multi-Task Learning #Artificial Intelligence #Neural Networks #Transfer Learning #Deep Learning #Transformer Architecture #Catastrophic Forgetting #Hypernetworks
