Chapters:
1. Infinite context length of LLM
2. INFINI paper by Google
3. Matrix Memory of limited size
4. Update memory simple
5. Retrieve memory simple
6. Update memory maths
7. Retrieve memory maths
8. Infini attention w/ internal RAG?
9. Benchmark data
10. Summary for green grasshoppers
11. TransformerFAM w/ Feedback attention
Description:
Explore a technical video presentation detailing Google's innovative Infini-attention transformer architecture, designed to handle context lengths of up to 1 million tokens. Learn about the integration of compressive memory components within vanilla attention mechanisms, allowing models to store and retrieve historical key-value states efficiently. Understand the technical challenges and solutions around information compression, implementation complexity, and performance optimization. Dive into detailed mathematical explanations of memory updates and retrieval processes, benchmark data analysis, and explore the relationship between Infini-attention and internal RAG systems. The presentation concludes with insights into TransformerFAM with Feedback attention and includes a simplified summary for beginners. Based on the research paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention," this comprehensive breakdown covers everything from basic concepts to advanced mathematical implementations.
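
A minimal NumPy sketch of the memory update and retrieval steps the description refers to, following the formulas in the Infini-attention paper (linear update rule M_s = M_{s-1} + sigma(K)^T V with normalization z_s = z_{s-1} + sum_t sigma(K_t), and retrieval A_mem = sigma(Q) M / (sigma(Q) z), where sigma is ELU + 1). Function names and the eps stabilizer are illustrative assumptions, not the paper's reference implementation.

import numpy as np

def elu_plus_one(x):
    # sigma(x) = ELU(x) + 1: keeps activations positive so the memory
    # normalization term stays well defined.
    return np.where(x > 0, x + 1.0, np.exp(x))

def update_memory(M, z, K, V):
    # Linear update rule: fold the current segment's key-value bindings
    # into the running matrix memory.
    #   M_s = M_{s-1} + sigma(K)^T V
    #   z_s = z_{s-1} + sum_t sigma(K_t)
    # Shapes: M (d_key, d_value), z (d_key,), K (n, d_key), V (n, d_value).
    sK = elu_plus_one(K)
    return M + sK.T @ V, z + sK.sum(axis=0)

def retrieve_memory(M, z, Q, eps=1e-6):
    # Read back past context for the current segment's queries:
    #   A_mem = sigma(Q) M / (sigma(Q) z)
    # eps is an illustrative guard against division by zero (assumption).
    sQ = elu_plus_one(Q)
    return (sQ @ M) / (sQ @ z + eps)[:, None]

# Toy usage: write one segment to memory, then read it back.
rng = np.random.default_rng(0)
n, d_key, d_value = 4, 8, 8
M = np.zeros((d_key, d_value))
z = np.zeros(d_key)
K = rng.normal(size=(n, d_key))
V = rng.normal(size=(n, d_value))
Q = rng.normal(size=(n, d_key))
M, z = update_memory(M, z, K, V)
A_mem = retrieve_memory(M, z, Q)   # shape (n, d_value)

In the full layer described in the paper, this memory readout A_mem is combined with standard local dot-product attention through a learned gating scalar, which is how the model balances long-range recall against the current segment's context.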

INFINI Attention: Efficient Infinite Context Transformers with 1 Million Token Context Length

Discover AI