1. Intro: Green grasshoppers
2. What do attention heads focus on?
3. Long context: factuality by retrieval heads
4. Needle in a Haystack benchmark
5. How many retrieval heads are in an LLM?
6. What is a retrieval head?
7. Retrieval heatmap consistent with the pre-trained base model
8. Retrieval heads and Chain-of-Thought reasoning
9. Retrieval heads explain why LLMs hallucinate
10. How to generate more retrieval heads in LLMs?
Description:
Learn about groundbreaking research from MIT and Peking University in this 31-minute video exploring retrieval heads, a newly discovered element of the transformer architecture. Dive into how these specialized attention heads significantly affect RAG performance, chain-of-thought reasoning, and retrieval quality over long contexts. Explore practical applications, including building better RAG systems, improving retrieval across long context windows, and reducing factual hallucination in LLMs. Examine five detailed real-world use cases spanning legal document summarization, financial market analysis, educational technology, content moderation, and scientific research. Follow along with a structured breakdown of topics: attention head focus areas, factuality in long contexts, the Needle in a Haystack benchmark, retrieval head mechanics, heatmap analysis, and the relationship of retrieval heads to Chain-of-Thought reasoning. Based on the arXiv pre-print "Retrieval Head Mechanistically Explains Long-Context Factuality" by Wenhao Wu and colleagues, the video offers insights into why LLMs hallucinate and into methods for generating additional retrieval heads in language models.

Understanding Retrieval Heads in Large Language Models - From Discovery to Applications

Discover AI