Explore a 25-minute video presentation on cutting-edge AI research that delves into Google's TransformerFAM (Feedback Attention Memory) and BSWA (Block Sliding Window Attention) architectures. Learn how these designs extend traditional Transformers with feedback mechanisms that simulate working memory, building on ring attention research from UC Berkeley. Understand the technical breakthrough that enables processing of indefinitely long sequences with linear complexity, rather than the quadratic complexity of standard self-attention. Master how the approach integrates seamlessly with existing pretrained models, requiring no additional weights while managing long-term dependencies through feedback loops. Follow along with detailed visualizations, pseudocode explanations, and practical demonstrations, from basic attention calculations to working feedback-loop code and time-series visualizations. Gain insights into how these architectures mimic biological neural networks and their potential impact on AI reasoning capabilities, all supported by the research published on arXiv.
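To build intuition for how block-wise attention and a feedback memory segment can fit together before watching, here is a minimal NumPy sketch. It is only an illustration under simplifying assumptions: a single head, no learned projections, and hypothetical names (`fam_bswa_forward`, `memory_blocks`) that are not the paper's API; see the arXiv paper and the video for the actual formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def fam_bswa_forward(x, fam, block_size, memory_blocks=1):
    """Illustrative pass of block sliding window attention with a
    feedback attention memory (FAM) segment.

    x   : (seq_len, d) input sequence (used directly as q, k, v here).
    fam : (fam_len, d) feedback memory carried across blocks.
    """
    outputs = []
    past_blocks = []  # local memory segment for the sliding window
    for start in range(0, x.shape[0], block_size):
        block = x[start:start + block_size]
        # Keys/values: a few past blocks + the current block + the FAM segment.
        window = past_blocks[-memory_blocks:] + [block]
        kv = np.concatenate(window + [fam], axis=0)
        # Block queries attend within the local window and to the feedback memory,
        # so compute per block stays constant and total cost grows linearly.
        outputs.append(attention(block, kv, kv))
        # Feedback update: FAM queries attend over the block and their own
        # previous state, compressing the block into the working memory.
        ctx = np.concatenate([block, fam], axis=0)
        fam = attention(fam, ctx, ctx)
        past_blocks.append(block)
    return np.concatenate(outputs, axis=0), fam

# Toy usage: 32 tokens, model dim 8, 4 FAM slots, block size 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))
fam0 = rng.normal(size=(4, 8))
y, fam = fam_bswa_forward(x, fam0, block_size=8)
print(y.shape, fam.shape)  # (32, 8) (4, 8)
```

Because each block only attends to a fixed-size window plus the fixed-size FAM, cost per block is constant, which is where the linear (rather than quadratic) scaling in sequence length comes from.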
TransformerFAM and BSWA: Understanding Feedback Attention Memory and Block Sliding Window Attention