Introduction to Reinforcement Learning - Multi-armed Bandits - Policy Gradient Methods - Contextual Bandits - Finite Markov Decision Processes - Dynamic Programming - Policy Iteration - Value Iteration - Monte Carlo Methods - Temporal Difference Learning - n-step Bootstrapping - Eligibility Traces - Model-based RL - Planning - On-policy Prediction with Function Approximation - On-policy Control with Function Approximation - Off-policy Control with Function Approximation - Hierarchical RL - POMDPs - Inverse RL - Exploration in RL - Offline RL.
- Site manager: Sarath Chandar Anbil Parthipan
- Instructor (editor): Xutong Zhao