Chapters:
1. OpenAI o1-type techniques for scaling test-time compute
2. Video overview: temperature, chain of thought
3. Training compute versus test-time compute
4. Why spend more compute on test time / inference?
5. Using verifiers to select the best answers
6. Exploring and critiquing/verifying answers during inference
7. Understanding temperature for sampling
8. Should you set temperature to zero?
9. Beam search
10. Problems with setting a non-zero temperature
11. Using top-p, top-k, min-p, and best-of
12. Recap on choosing temperature for sampling
13. How to implement chain of thought
14. Setup for notebook run-through on GSM8K and HotpotQA
15. Using sampling and chain of thought on HotpotQA and GSM8K
16. Running vLLM in a Jupyter notebook allows for batching
17. Scoring / grading with OpenAI GPT-4o mini using regex enforcement
18. Multi-threading the scoring / grading for speed
19. Running the dataset multiple times to get the mean and mean absolute deviation of correct answers
20. Controlling sampling parameters: min-p, top-p, top-k, beam search, temperature
21. Running temperature / sampling ablations WITHOUT chain of thought
22. Chain-of-thought setup
23. Running ablations WITH chain of thought
24. GSM8K results charts
25. HotpotQA results charts
26. Recommendations on sampling, temperature, and chain of thought
27. Video resources
Description:
Explore advanced inference techniques for large language models in this comprehensive 55-minute video lecture. Dive into the differences between training and test-time compute, and learn why investing more resources in inference can be beneficial. Discover how to use verifiers for selecting optimal answers and explore methods for critiquing responses during inference. Gain a deep understanding of temperature in sampling, including when to use zero or non-zero values, and explore alternatives like beam search, top-p, top-k, and min-p sampling. Learn to implement chain-of-thought reasoning and see practical demonstrations using datasets like GSM8K and HotpotQA. Follow along with notebook run-throughs, learn to use vLLM for efficient batching, and discover techniques for scoring and grading responses. Analyze the results of various sampling and chain-of-thought configurations through detailed charts, and receive expert recommendations on optimizing these parameters for different tasks.
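As a rough illustration of how the sampling controls covered in the video interact, here is a minimal, self-contained sketch of temperature scaling followed by top-k, top-p (nucleus), and min-p filtering over a toy logit vector. The function name and structure are illustrative assumptions, not code from the video:

```python
import math

def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return the token distribution after temperature scaling and filtering."""
    # Temperature 0 is conventionally greedy decoding: all mass on the argmax token.
    if temperature == 0:
        probs = [0.0] * len(logits)
        probs[max(range(len(logits)), key=lambda i: logits[i])] = 1.0
        return probs
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Token indices sorted by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep = set(order)
    # top-k: keep only the k most probable tokens.
    if top_k > 0:
        keep &= set(order[:top_k])
    # top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    if top_p < 1.0:
        cum, nucleus = 0.0, set()
        for i in order:
            nucleus.add(i)
            cum += probs[i]
            if cum >= top_p:
                break
        keep &= nucleus
    # min-p: drop tokens whose probability is below min_p times the top token's.
    if min_p > 0.0:
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}
    # Zero out filtered tokens and renormalize the survivors.
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    z = sum(filtered)
    return [p / z for p in filtered]
```

In practice you would not implement this by hand: inference libraries such as vLLM expose these knobs directly as sampling parameters (`temperature`, `top_p`, `top_k`, `min_p`), which is how the notebook ablations in the video vary them.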

Test Time Compute: Sampling and Chain of Thought Techniques

Trelis Research