• Genre: Book
  • Language: English
  • Publisher: Springer
  • Publication date: 06/2022
  • Edition: 1st ed. 2022

Deep Reinforcement Learning

54,98 €
52,23 €
DESCRIPTION
Deep reinforcement learning has attracted considerable attention recently. Impressive results have been achieved in fields as diverse as autonomous driving, game playing, molecular recombination, and robotics. In all these fields, computer programs have taught themselves to solve problems that were previously considered very difficult. In the game of Go, the program AlphaGo has even learned to outmatch three of the world’s leading players.

Deep reinforcement learning takes its inspiration from the fields of biology and psychology. Biology has inspired the creation of artificial neural networks and deep learning, while psychology studies how animals and humans learn, and how subjects’ desired behavior can be reinforced with positive and negative stimuli. When we see how reinforcement learning teaches a simulated robot to walk, we are reminded of how children learn, through playful exploration. Techniques inspired by biology and psychology work amazingly well in computers: animal behavior and the structure of the brain serve as new blueprints for science and engineering. In fact, computers truly seem to possess aspects of human behavior; as such, this field goes to the heart of the dream of artificial intelligence.

These research advances have not gone unnoticed by educators. Many universities have begun offering courses on deep reinforcement learning. The aim of this book is to provide an overview of the field at the proper level of detail for a graduate course in artificial intelligence. It covers the complete field, from the basic algorithms of deep Q-learning to advanced topics such as multi-agent reinforcement learning and meta learning.
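
To illustrate the reward-driven learning the description refers to, the following minimal Python sketch shows a tabular Q-learning update, the classic precursor of the deep Q-learning algorithms the book starts from. It is an illustrative example only, not code from the book; the toy state and action counts and the hyperparameters are assumptions chosen for brevity.

    # Illustrative sketch only (not from the book): tabular Q-learning on an
    # assumed toy problem, e.g. a 4x4 gridworld with 16 states and 4 actions.
    import random

    n_states, n_actions = 16, 4              # assumed problem size
    alpha, gamma, epsilon = 0.1, 0.99, 0.1   # learning rate, discount, exploration
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def choose_action(state):
        # Epsilon-greedy: usually exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            return random.randrange(n_actions)
        return max(range(n_actions), key=lambda a: Q[state][a])

    def update(state, action, reward, next_state):
        # Nudge Q(s, a) toward the reward plus the discounted value of the best
        # next action: positive rewards reinforce a behavior, negative ones discourage it.
        best_next = max(Q[next_state])
        Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

Deep Q-learning, covered early in the book, replaces the Q table with a neural network so that the same kind of update scales to high-dimensional inputs such as Atari screens.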

TABLE OF CONTENTS
1 Introduction
   1.1 What is Deep Reinforcement Learning?
   1.2 Three Machine Learning Paradigms
   1.3 Overview of the Book
2 Tabular Value-Based Methods
   2.1 Sequential Decision Problems
   2.2 Tabular Value-Based Agents
   2.3 Classic Gym Environments
   2.4 Summary and Further Reading
   2.5 Exercises
3 Approximating the Value Function
   3.1 Large, High-Dimensional Problems
   3.2 Deep Value-Based Agents
   3.3 Atari 2600 Environments
   3.4 Summary and Further Reading
   3.5 Exercises
4 Policy-Based Methods
   4.1 Continuous Problems
   4.2 Policy-Based Agents
   4.3 Locomotion and Visuo-Motor Environments
   4.4 Summary and Further Reading
   4.5 Exercises
5 Model-Based Methods
   5.1 Dynamics Models of High-Dimensional Problems
   5.2 Learning and Planning Agents
   5.3 High-Dimensional Environments
   5.4 Summary and Further Reading
   5.5 Exercises
6 Two-Agent Reinforcement Learning
   6.1 Two-Agent Zero-Sum Problems
   6.2 Tabula Rasa Self-Play Agents
   6.3 Self-Play Environments
   6.4 Summary and Further Reading
   6.5 Exercises
7 Multi-Agent Reinforcement Learning
   7.1 Multi-Agent Problems
   7.2 Multi-Agent Reinforcement Learning Agents
   7.3 Multi-Agent Environments
   7.4 Summary and Further Reading
   7.5 Exercises
8 Hierarchical Reinforcement Learning
   8.1 Granularity of the Structure of Problems
   8.2 Divide and Conquer for Agents
   8.3 Hierarchical Environments
   8.4 Summary and Further Reading
   8.5 Exercises
9 Meta Learning
   9.1 Learning to Learn Related Problems
   9.2 Transfer Learning and Meta Learning Agents
   9.3 Meta-Learning Environments
   9.4 Summary and Further Reading
   9.5 Exercises
10 Further Developments
   10.1 Developments in Deep Reinforcement Learning
   10.2 Main Challenges
   10.3 The Future of Artificial Intelligence
A Deep Reinforcement Learning Suites
   A.1 Environments
   A.2 Agent Algorithms
   A.3 Deep Learning Suites
B Deep Learning
   B.1 Machine Learning
   B.2 Deep Learning
   B.3 Datasets and Software
C Mathematical Background
   C.1 Sets and Functions
   C.2 Probability Distributions
   C.3 Derivative of an Expectation
   C.4 Bellman Equations

AUTHOR
Aske Plaat is a Professor of Data Science at Leiden University and scientific director of the Leiden Institute of Advanced Computer Science (LIACS). He is co-founder of the Leiden Centre of Data Science (LCDS) and initiated SAILS, a multidisciplinary program on artificial intelligence. His research interests include reinforcement learning, combinatorial games and self-learning systems. He is the author of Learning to Play (published by Springer in 2020), which specifically covers reinforcement learning and games.

OTHER INFORMATION
• Condition: New
  • ISBN: 9789811906374
  • Dimensions: 235 x 155 mm; weight: 646 g
  • Format: Paperback
  • Illustration Notes: XV, 406 p. 1 illus.
  • Arabic-numbered pages: 406
  • Roman-numbered pages: xv