CS885: Multi-Armed Bandits

Pascal Poupart
Syllabus:
1. Multi-armed bandits
2. Exploration-exploitation
3. Stochastic bandits
4. Bandits from gambling
5. Bandits in practice
6. Online optimization
7. Simplified version
8. The problem
9. Heuristics
10. Notion of regret
11. Epsilon-greedy strategy
12. Single state
13. Epsilon-greedy
14. Different approaches
15. In practice
Description:
Explore the fascinating world of multi-armed bandits in this comprehensive 57-minute lecture by Pascal Poupart. Delve into key concepts such as exploration-exploitation trade-offs, stochastic bandits, and online optimization. Learn about the origins of bandits in gambling and their practical applications. Understand the simplified version of the problem, various heuristics, and the notion of regret. Discover the epsilon-greedy strategy and its implementation in single-state scenarios. Gain insights into different approaches and their effectiveness in real-world situations.
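For reference, the notion of regret mentioned above is usually formalized as follows (this is the standard textbook definition, not notation taken from the video): for a bandit with arm means \mu_1, \dots, \mu_K and best mean \mu^* = \max_a \mu_a, the expected cumulative regret of a strategy after T pulls is

R_T = T\,\mu^* - \mathbb{E}\Big[\sum_{t=1}^{T} r_t\Big],

i.e., the reward lost relative to always playing the best arm in hindsight; a good strategy keeps R_T growing sublinearly in T.

The epsilon-greedy strategy covered in the lecture is simple enough to sketch in a few lines. The following is a minimal Python illustration under assumed Bernoulli rewards, not the lecture's own code; the arm probabilities, epsilon value, and horizon below are made up for the example.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, horizon=10_000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit; return estimates and regret."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # empirical mean reward per arm
    total_reward = 0.0

    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest empirical mean
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0  # Bernoulli draw
        counts[arm] += 1
        # incremental update of the running mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    # empirical regret: shortfall versus always pulling the best arm
    regret = horizon * max(true_means) - total_reward
    return estimates, regret

estimates, regret = epsilon_greedy([0.3, 0.5, 0.7])  # made-up arm probabilities
print("estimated means:", estimates)
print("empirical regret:", regret)
```

A fixed epsilon keeps exploring suboptimal arms forever, so its regret grows linearly in the horizon; a common refinement is to decay epsilon over time, trading exploration against exploitation differently.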
