
Deep Reinforcement Learning
by Aske Plaat




Availability: Normally available within 15 days


PRICE: 54,98 €
NICEPRICE: 52,23 €
DISCOUNT: 5%



This product qualifies for FREE SHIPPING when you select the Corriere Veloce (express courier) option at checkout.


Also payable with the Carta della cultura giovani e del merito, 18App Bonus Cultura, and Carta del Docente (Italian culture vouchers).





Details

Genre: Book
Language: English
Publisher: Springer
Publication: 06/2022
Edition: 1st ed. 2022





Description

Deep reinforcement learning has attracted considerable attention recently. Impressive results have been achieved in fields as diverse as autonomous driving, game playing, molecular recombination, and robotics. In all these fields, computer programs have taught themselves to solve problems that were previously considered very difficult. In the game of Go, the program AlphaGo has even learned to outmatch three of the world's leading players.

Deep reinforcement learning takes its inspiration from the fields of biology and psychology. Biology has inspired the creation of artificial neural networks and deep learning, while psychology studies how animals and humans learn, and how subjects' desired behavior can be reinforced with positive and negative stimuli. When we see how reinforcement learning teaches a simulated robot to walk, we are reminded of how children learn through playful exploration. Techniques inspired by biology and psychology work amazingly well in computers: animal behavior and the structure of the brain serve as new blueprints for science and engineering. Indeed, computers truly seem to possess aspects of human behavior; as such, this field goes to the heart of the dream of artificial intelligence.

These research advances have not gone unnoticed by educators. Many universities have begun offering courses on the subject of deep reinforcement learning. The aim of this book is to provide an overview of the field, at the proper level of detail for a graduate course in artificial intelligence. It covers the complete field, from the basic algorithms of deep Q-learning to advanced topics such as multi-agent reinforcement learning and meta learning.
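As a flavor of the kind of algorithm the book's early chapters cover, the following is a minimal sketch of tabular Q-learning, the classical precursor of deep Q-learning. The toy chain environment and all hyperparameter values are illustrative assumptions, not code from the book.

import random

# Toy chain environment (an illustrative assumption, not from the book):
# states 0..4; the agent starts at 0 and earns reward 1 on reaching state 4.
N_STATES = 5
ACTIONS = (-1, +1)               # step left or step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Tabular Q function: one value per (state, action) pair.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Environment transition: move, clipped to the chain ends."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def greedy(s):
    """Greedy action for state s, breaking ties randomly."""
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

for _ in range(500):             # training episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy exploration.
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(s)
        s2, r = step(s, a)
        # Temporal-difference update toward the bootstrapped target:
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

print([greedy(s) for s in range(N_STATES - 1)])   # learned policy: all +1

Deep Q-learning, the book's starting point, replaces this lookup table with a neural network so that the value function generalizes across large, high-dimensional state spaces such as the Atari 2600 screens of Chapter 3.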





Table of Contents

1 Introduction
1.1 What is Deep Reinforcement Learning?
1.2 Three Machine Learning Paradigms
1.3 Overview of the Book
2 Tabular Value-Based Methods
2.1 Sequential Decision Problems
2.2 Tabular Value-Based Agents
2.3 Classic Gym Environments
2.4 Summary and Further Reading
2.5 Exercises
3 Approximating the Value Function
3.1 Large, High-Dimensional Problems
3.2 Deep Value-Based Agents
3.3 Atari 2600 Environments
3.4 Summary and Further Reading
3.5 Exercises
4 Policy-Based Methods
4.1 Continuous Problems
4.2 Policy-Based Agents
4.3 Locomotion and Visuo-Motor Environments
4.4 Summary and Further Reading
4.5 Exercises
5 Model-Based Methods
5.1 Dynamics Models of High-Dimensional Problems
5.2 Learning and Planning Agents
5.3 High-Dimensional Environments
5.4 Summary and Further Reading
5.5 Exercises
6 Two-Agent Reinforcement Learning
6.1 Two-Agent Zero-Sum Problems
6.2 Tabula Rasa Self-Play Agents
6.3 Self-Play Environments
6.4 Summary and Further Reading
6.5 Exercises
7 Multi-Agent Reinforcement Learning
7.1 Multi-Agent Problems
7.2 Multi-Agent Reinforcement Learning Agents
7.3 Multi-Agent Environments
7.4 Summary and Further Reading
7.5 Exercises
8 Hierarchical Reinforcement Learning
8.1 Granularity of the Structure of Problems
8.2 Divide and Conquer for Agents
8.3 Hierarchical Environments
8.4 Summary and Further Reading
8.5 Exercises
9 Meta Learning
9.1 Learning to Learn Related Problems
9.2 Transfer Learning and Meta Learning Agents
9.3 Meta-Learning Environments
9.4 Summary and Further Reading
9.5 Exercises
10 Further Developments
10.1 Developments in Deep Reinforcement Learning
10.2 Main Challenges
10.3 The Future of Artificial Intelligence
A Deep Reinforcement Learning Suites
A.1 Environments
A.2 Agent Algorithms
A.3 Deep Learning Suites
B Deep Learning
B.1 Machine Learning
B.2 Deep Learning
B.3 Datasets and Software
C Mathematical Background
C.1 Sets and Functions
C.2 Probability Distributions
C.3 Derivative of an Expectation
C.4 Bellman Equations




Author

Aske Plaat is a Professor of Data Science at Leiden University and scientific director of the Leiden Institute of Advanced Computer Science (LIACS). He is co-founder of the Leiden Centre of Data Science (LCDS) and initiated SAILS, a multidisciplinary program on artificial intelligence. His research interests include reinforcement learning, combinatorial games and self-learning systems. He is the author of Learning to Play (published by Springer in 2020), which specifically covers reinforcement learning and games.










Other Information

ISBN: 9789811906374

Condition: New
Dimensions: 235 x 155 mm, weight 646 g
Format: Paperback
Illustration notes: XV, 406 p., 1 illus.
Arabic-numbered pages: 406
Roman-numbered pages: xv

