[R] Reinforcement Learning for Sequential Decision and Optimal Control
Since the early 21st century, artificial intelligence (AI) has been reshaping almost every area of human society and has the potential to spark the fourth industrial revolution. Notable examples can be found in road transportation, where AI has drastically changed automobile design and traffic management. Many new technologies, such as driver assistance, autonomous driving, and cloud-based cooperation, are emerging at a remarkable pace. These technologies have the potential to significantly improve driving performance, reduce traffic accidents, and relieve urban congestion.
As one of the most important branches of AI, reinforcement learning (RL) has attracted increasing attention in recent years. RL is an interdisciplinary field that combines trial-and-error learning with optimal control, and it promises optimal solutions for decision-making and control in large-scale, complex dynamic processes. One of its most conspicuous successes is AlphaGo from Google DeepMind, which has beaten the world's top professional Go players. The underlying key technology is deep reinforcement learning, which equips AlphaGo with a remarkable capacity for self-improvement and a high level of playing intelligence.
Despite these successes, the application of RL is still in its infancy because most RL algorithms are rather difficult to comprehend and implement. RL connects deeply with statistical learning and convex optimization, and it involves a wide range of new concepts and theories. A beginner must go through a long and often tedious learning process to master RL; without fully understanding the underlying principles, it is very difficult for new users to make the proper adjustments needed to achieve the best application performance.
Reference:
Shengbo Eben Li, *Reinforcement Learning for Sequential Decision and Optimal Control*. Springer, Singapore, 2023.
Website of e-book:
https://link.springer.com/book/10.1007/978-981-19-7784-8
Book contents
This book aims to provide a systematic introduction to fundamental RL theory, mainstream RL algorithms, and typical RL applications for researchers and engineers. The main topics include Markov decision processes, Monte Carlo learning, temporal difference learning, RL with function approximation, policy gradient methods, approximate dynamic programming, and deep reinforcement learning.
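For readers new to these topics, the common problem underlying all of them can be stated in the standard discounted Markov decision process form. The notation below is the usual textbook convention, not a quotation from the book:

```latex
% Standard discounted MDP objective: find a policy \pi that maximizes
% the expected discounted return under the (possibly unknown) dynamics p.
\[
\max_{\pi}\; J(\pi)
  = \mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big],
\qquad a_t \sim \pi(\cdot \mid s_t),\quad s_{t+1} \sim p(\cdot \mid s_t, a_t),
\]
% where \gamma \in [0, 1) is the discount factor and r is the reward function.
```

Each chapter below studies a different way of solving, approximating, or learning this problem.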
- Chapter 1 provides an overview of RL, including its history, notable scholars, success stories, and current challenges.
- Chapter 2 discusses the basics of RL, including the main concepts and terminology, Bellman's optimality condition, and the general problem formulation (a worked form of the Bellman equation appears after this list).
- Chapter 3 introduces Monte Carlo learning methods for model-free RL, including on-policy/off-policy methods and the importance sampling technique.
- Chapter 4 introduces temporal difference learning methods for model-free RL, including Sarsa, Q-learning, and expected Sarsa (a minimal Q-learning update is sketched after this list).
- Chapter 5 introduces stochastic dynamic programming (DP), i.e., model-based RL with tabular representation, including value iteration DP, policy iteration DP, and their convergence mechanisms (a value iteration sketch follows this list).
- Chapter 6 introduces how to approximate policy and value functions in indirect RL methods as well as the associated actor-critic architecture.
- Chapter 7 derives different kinds of direct policy gradients, including the likelihood ratio gradient, the natural policy gradient, and a few advanced variants (the likelihood ratio form is written out after this list).
- Chapter 8 introduces infinite-horizon and finite-horizon approximate dynamic programming (ADP), as well as the latter's connection with model predictive control.
- Chapter 9 discusses how to handle state constraints, their connection with feasibility and safety, and the newly proposed actor-critic-scenery learning architecture.
- Chapter 10 is devoted to deep reinforcement learning, including how to train artificial neural networks and typical deep RL algorithms such as DQN, DDPG, TD3, TRPO, PPO, SAC, and DSAC (a DQN loss sketch appears after this list).
- Chapter 11 covers various advanced RL topics, including robust RL, POMDPs, multi-agent RL, meta-RL, inverse RL, offline RL, and major RL libraries and platforms.
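As promised in the Chapter 2 summary above, the Bellman optimality condition for the state-value function takes its standard textbook form, consistent with the MDP objective sketched earlier:

```latex
% Bellman optimality equation and the greedy (optimal) policy it induces.
\[
V^{*}(s) = \max_{a}\Big( r(s, a) + \gamma \sum_{s'} p(s' \mid s, a)\, V^{*}(s') \Big),
\qquad
\pi^{*}(s) \in \arg\max_{a}\Big( r(s, a) + \gamma \sum_{s'} p(s' \mid s, a)\, V^{*}(s') \Big).
\]
```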
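For the temporal difference methods of Chapter 4, below is a minimal sketch of the classic tabular Q-learning update; the table sizes and usage values are hypothetical, chosen only for illustration:

```python
import numpy as np

def q_learning_step(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning update: move Q[s, a] toward the TD target."""
    td_target = r + gamma * np.max(Q[s_next])   # bootstrap from the greedy next action
    Q[s, a] += alpha * (td_target - Q[s, a])    # correct by the TD error
    return Q

# Hypothetical usage with a 10-state, 4-action table:
Q = np.zeros((10, 4))
Q = q_learning_step(Q, s=0, a=1, r=1.0, s_next=3)
```

Sarsa differs only in the target (it bootstraps from the action actually taken rather than the greedy one), which is what makes Q-learning off-policy and Sarsa on-policy.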
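For the model-based methods of Chapter 5, the sketch below implements tabular value iteration, i.e., repeated application of the Bellman optimality operator until convergence. The transition tensor `P` and reward matrix `R` are hypothetical inputs:

```python
import numpy as np

def value_iteration(P, R, gamma=0.99, tol=1e-6):
    """Tabular value iteration.

    P: transition probabilities, shape (S, A, S'); R: rewards, shape (S, A).
    Returns the converged state values and the greedy policy.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_{s'} P[s, a, s'] * V[s']
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new
```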
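For Chapter 7, the likelihood ratio gradient mentioned above is the familiar REINFORCE-style identity, shown here in its standard form; the variance-reduction variants in the chapter replace the action-value term with a return or advantage estimate:

```latex
% Likelihood ratio (score function) policy gradient.
\[
\nabla_{\theta} J(\pi_{\theta})
  = \mathbb{E}_{\pi_{\theta}}\!\Big[
      \sum_{t=0}^{\infty}
      \nabla_{\theta} \log \pi_{\theta}(a_t \mid s_t)\; Q^{\pi_{\theta}}(s_t, a_t)
    \Big].
\]
```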
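Finally, for the deep RL algorithms of Chapter 10, the sketch below shows the core DQN loss in PyTorch: a bootstrapped TD target computed with a frozen target network. The network architecture and batch layout are illustrative assumptions, not taken from the book:

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """A small MLP Q-network; layer sizes are arbitrary illustrative choices."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """Mean-squared TD error between Q(s, a) and the frozen bootstrapped target."""
    obs, action, reward, next_obs, done = batch
    q = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)  # Q(s, a) for taken actions
    with torch.no_grad():
        q_next = target_net(next_obs).max(dim=1).values       # greedy value from target net
        target = reward + gamma * (1.0 - done) * q_next       # zero bootstrap at terminals
    return nn.functional.mse_loss(q, target)
```

DDPG, TD3, and SAC extend the same target-network idea to continuous actions, while TRPO and PPO instead constrain how far each policy update may move.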
Author information:
Shengbo Eben Li is a professor at Tsinghua University working in the interdisciplinary field of autonomous driving and artificial intelligence. Before joining Tsinghua University, he worked at Stanford University, the University of Michigan, and UC Berkeley. His active research interests include intelligent vehicles and driver assistance, deep reinforcement learning, and optimal control and estimation. He has published more than 130 peer-reviewed papers in top-tier international journals and conferences, and he has received best paper awards or finalist honors at IEEE ITSC, ICCAS, IEEE ICUS, IEEE IV, L4DC, and other venues. His academic honors include the National Award for Technological Invention of China (2013), the National Award for Progress in Science and Technology of China (2018), Distinguished Young Scholar of Beijing NSF (2018), and Youth Science and Technology Innovation Leader from MOST China (2020). He also serves on the Board of Governors of the IEEE ITS Society, as a Senior Associate Editor of IEEE OJ ITS, and as an Associate Editor of IEEE ITSM, IEEE Trans ITS, Automotive Innovation, and other journals.