Chapters:
1. Infinite context length of LLM
2. INFINI paper by Google
3. Matrix Memory of limited size
4. Update memory simple
5. Retrieve memory simple
6. Update memory maths
7. Retrieve memory maths
8. Infini attention w/ internal RAG?
9. Benchmark data
10. Summary for green grasshoppers
11. TransformerFAM w/ Feedback attention
Description:
Explore a technical video presentation detailing Google's innovative Infini-attention transformer architecture, designed to handle context lengths of up to 1 million tokens. Learn about the integration of compressive memory components within vanilla attention mechanisms, allowing models to store and retrieve historical key-value states efficiently. Understand the technical challenges and solutions around information compression, implementation complexity, and performance optimization. Dive into detailed mathematical explanations of memory updates and retrieval processes, benchmark data analysis, and explore the relationship between Infini-attention and internal RAG systems. The presentation concludes with insights into TransformerFAM with Feedback attention and includes a simplified summary for beginners. Based on the research paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention," this comprehensive breakdown covers everything from basic concepts to advanced mathematical implementations.
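
A minimal NumPy sketch of the memory update and retrieval steps the description refers to, following the formulas in the Infini-attention paper (linear update rule M_s = M_{s-1} + sigma(K)^T V with normalization z_s = z_{s-1} + sum_t sigma(K_t), and retrieval A_mem = sigma(Q) M / (sigma(Q) z), where sigma is ELU + 1). Function names and the eps stabilizer are illustrative assumptions, not the paper's reference implementation.

import numpy as np

def elu_plus_one(x):
    # sigma(x) = ELU(x) + 1: keeps activations positive so the memory
    # normalization term stays well defined.
    return np.where(x > 0, x + 1.0, np.exp(x))

def update_memory(M, z, K, V):
    # Linear update rule: fold the current segment's key-value bindings
    # into the running matrix memory.
    #   M_s = M_{s-1} + sigma(K)^T V
    #   z_s = z_{s-1} + sum_t sigma(K_t)
    # Shapes: M (d_key, d_value), z (d_key,), K (n, d_key), V (n, d_value).
    sK = elu_plus_one(K)
    return M + sK.T @ V, z + sK.sum(axis=0)

def retrieve_memory(M, z, Q, eps=1e-6):
    # Read back past context for the current segment's queries:
    #   A_mem = sigma(Q) M / (sigma(Q) z)
    # eps is an illustrative guard against division by zero (assumption).
    sQ = elu_plus_one(Q)
    return (sQ @ M) / (sQ @ z + eps)[:, None]

# Toy usage: write one segment to memory, then read it back.
rng = np.random.default_rng(0)
n, d_key, d_value = 4, 8, 8
M = np.zeros((d_key, d_value))
z = np.zeros(d_key)
K = rng.normal(size=(n, d_key))
V = rng.normal(size=(n, d_value))
Q = rng.normal(size=(n, d_key))
M, z = update_memory(M, z, K, V)
A_mem = retrieve_memory(M, z, Q)   # shape (n, d_value)

In the full layer described in the paper, this memory readout A_mem is combined with standard local dot-product attention through a learned gating scalar, which is how the model balances long-range recall against the current segment's context.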

INFINI Attention: Efficient Infinite Context Transformers with 1 Million Token Context Length

Discover AI