Finite time analysis of temporal difference learning with linear function approximation
Description:
Watch a technical lecture exploring the finite-time analysis of temporal difference (TD) learning with linear function approximation, delivered by Prof. Prashanth L.A. from IIT Madras. Dive deep into the mathematical analysis of tail-averaged TD learning, examining how it achieves the optimal O(1/t) convergence rate both in expectation and with high probability. Learn about a novel step-size selection approach that does not require eigenvalue information about the matrix underlying the projected TD fixed point. Discover how tail averaging improves the decay rate of the initial error compared to full-iterate averaging, and explore a regularized TD variant that performs well with ill-conditioned features. The lecture draws from research accepted at AISTATS 2023, presented by an accomplished researcher whose work spans reinforcement learning, simulation optimization, and multi-armed bandits, with applications across transportation systems, wireless networks, and recommendation systems.
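To make the idea concrete, here is a minimal sketch (not the lecture's actual implementation) of TD(0) with linear function approximation and tail averaging: the parameter is updated from sampled transitions of a small synthetic Markov chain, and the returned estimate averages only the last half of the iterates rather than all of them. The chain, features, rewards, and step size below are illustrative assumptions.

```python
import numpy as np

def tail_averaged_td0(num_iters=5000, alpha=0.05, gamma=0.9, seed=0):
    """TD(0) with linear features, returning the tail-averaged parameter.

    All problem data (chain, features, rewards) are randomly generated
    placeholders for illustration; they are not from the lecture.
    """
    rng = np.random.default_rng(seed)
    n_states, d = 5, 3
    phi = rng.normal(size=(n_states, d))           # feature matrix, one row per state
    P = rng.random((n_states, n_states))
    P /= P.sum(axis=1, keepdims=True)              # row-stochastic transition matrix
    r = rng.normal(size=n_states)                  # per-state rewards

    theta = np.zeros(d)
    iterates = []
    s = 0
    for _ in range(num_iters):
        s_next = rng.choice(n_states, p=P[s])
        # TD error: reward plus discounted next-state value minus current value
        delta = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
        theta = theta + alpha * delta * phi[s]
        iterates.append(theta.copy())
        s = s_next

    # Tail averaging: discard the first half of the trajectory, so the
    # initial error decays faster than under full-iterate averaging.
    return np.mean(iterates[num_iters // 2:], axis=0)
```

A constant step size is used here for simplicity; one point of the lecture is that the step size can be chosen without knowing eigenvalues of the matrix defining the projected TD fixed point.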
Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation: Tail Averaging and Regularization