1. Key ideas of the paper
2. Abstract
3. Note on k-NN non-parametric machine learning
4. Data and NPT setup explained
5. NPT loss is inspired by BERT
6. A high-level architecture overview
7. NPT jointly learns imputation and prediction
8. Architecture deep dive: input embeddings, etc.
9. More details on the stochastic masking loss
10. Connections to Graph Neural Networks and CNNs
11. NPT achieves great results on tabular data benchmarks
12. NPT learns the underlying relational, causal mechanisms
13. NPT does rely on other datapoints
14. NPT attends to similar vectors
15. Conclusions
Description:
Dive deep into the world of Non-Parametric Transformers with this comprehensive 46-minute video lecture. Explore the key concepts from the paper "Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning". Learn about the NPT architecture, its connections to BERT, Graph Neural Networks, and CNNs, and understand how it achieves impressive results on tabular data benchmarks. Discover how NPT learns underlying relational and causal mechanisms, and examine its ability to attend to similar vectors. Gain valuable insights into this innovative approach to machine learning through detailed explanations and visual aids.
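The core idea highlighted in the lecture is that NPT applies self-attention across datapoints (rows of the dataset) in addition to attention across attributes (columns). The sketch below is a minimal, hypothetical illustration of that two-level attention pattern, not the paper's actual implementation; all variable names and dimensions are chosen for illustration only.

```python
# Minimal sketch (assumed, not the paper's code) of attention between
# datapoints followed by attention between attributes, as in NPT.
import torch
import torch.nn as nn

n_datapoints, n_attributes, d_embed = 8, 4, 16

# Embed each (datapoint, attribute) cell into a d_embed-dimensional vector.
cells = torch.randn(n_datapoints, n_attributes, d_embed)

# Attention between datapoints: flatten the attributes of each row into one
# token, then let every datapoint attend to every other datapoint.
abd = nn.MultiheadAttention(embed_dim=n_attributes * d_embed, num_heads=4,
                            batch_first=True)
rows = cells.reshape(1, n_datapoints, n_attributes * d_embed)
rows_out, _ = abd(rows, rows, rows)

# Attention between attributes: treat each datapoint separately and let its
# attribute embeddings attend to one another.
aba = nn.MultiheadAttention(embed_dim=d_embed, num_heads=4, batch_first=True)
cols = rows_out.reshape(n_datapoints, n_attributes, d_embed)
cols_out, _ = aba(cols, cols, cols)

print(cols_out.shape)  # torch.Size([8, 4, 16])
```

In the paper these two attention stages are stacked into alternating layers, and a BERT-style stochastic masking loss over both features and targets drives the model to jointly learn imputation and prediction.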

Non-Parametric Transformers - Paper Explained

Aleksa Gordić - The AI Epiphany