
Handbook of reinforcement learning and control / Kyriakos G. Vamvoudakis [and three others], editors.

Springer Nature (Springer Intelligent Technologies and Robotics eBooks), 2021. English. Available online.

Format:
Book
Contributor:
Vamvoudakis, Kyriakos G., editor.
Series:
Studies in Systems, Decision and Control ; Volume 325
Language:
English
Subjects (All):
Reinforcement learning.
Automatic control--Sensitivity.
Automatic control.
Physical Description:
1 online resource (839 pages)
Place of Publication:
Cham, Switzerland : Springer, [2021]
Summary:
This handbook presents state-of-the-art research in reinforcement learning, focusing on its applications in the control and game theory of dynamic systems and on future directions for related research and technology. The contributions gathered in this book address challenges faced when using learning and adaptation methods to solve academic and industrial problems, such as optimization in dynamic environments with single and multiple agents, convergence and performance analysis, and online implementation. They explore means by which these difficulties can be overcome, covering a wide range of related topics including deep learning, artificial intelligence, applications of game theory, mixed modality learning, and multi-agent reinforcement learning. Practicing engineers and scholars in the fields of machine learning, game theory, and autonomous control will find the Handbook of Reinforcement Learning and Control thought-provoking, instructive, and informative.
Contents:
Intro
Preface
Contents
Part I: Theory of Reinforcement Learning for Model-Free and Model-Based Control and Games
1 What May Lie Ahead in Reinforcement Learning
References
2 Reinforcement Learning for Distributed Control and Multi-player Games
2.1 Introduction
2.2 Optimal Control of Continuous-Time Systems
2.2.1 IRL with Experience Replay Learning Technique
2.2.2 H∞ Control of CT Systems
2.3 Nash Games
2.4 Graphical Games
2.4.1 Off-Policy RL for Graphical Games
2.5 Output Synchronization of Multi-agent Systems
2.6 Conclusion and Open Research Directions
3 From Reinforcement Learning to Optimal Control: A Unified Framework for Sequential Decisions
3.1 Introduction
3.2 The Communities of Sequential Decisions
3.3 Stochastic Optimal Control Versus Reinforcement Learning
3.3.1 Stochastic Control
3.3.2 Reinforcement Learning
3.3.3 A Critique of the MDP Modeling Framework
3.3.4 Bridging Optimal Control and Reinforcement Learning
3.4 The Universal Modeling Framework
3.4.1 Dimensions of a Sequential Decision Model
3.4.2 State Variables
3.4.3 Objective Functions
3.4.4 Notes
3.5 Energy Storage Illustration
3.5.1 A Basic Energy Storage Problem
3.5.2 With a Time-Series Price Model
3.5.3 With Passive Learning
3.5.4 With Active Learning
3.5.5 With Rolling Forecasts
3.5.6 Remarks
3.6 Designing Policies
3.6.1 Policy Search
3.6.2 Lookahead Approximations
3.6.3 Hybrid Policies
3.6.4 Remarks
3.6.5 Stochastic Control, Reinforcement Learning, and the Four Classes of Policies
3.7 Policies for Energy Storage
3.8 Extension to Multi-agent Systems
3.9 Observations
4 Fundamental Design Principles for Reinforcement Learning Algorithms
4.1 Introduction
4.1.1 Stochastic Approximation and Reinforcement Learning
4.1.2 Sample Complexity Bounds
4.1.3 What Will You Find in This Chapter?
4.1.4 Literature Survey
4.2 Stochastic Approximation: New and Old Tricks
4.2.1 What is Stochastic Approximation?
4.2.2 Stochastic Approximation and Learning
4.2.3 Stability and Convergence
4.2.4 Zap-Stochastic Approximation
4.2.5 Rates of Convergence
4.2.6 Optimal Convergence Rate
4.2.7 TD and LSTD Algorithms
4.3 Zap Q-Learning: Fastest Convergent Q-Learning
4.3.1 Markov Decision Processes
4.3.2 Value Functions and the Bellman Equation
4.3.3 Q-Learning
4.3.4 Tabular Q-Learning
4.3.5 Convergence and Rate of Convergence
4.3.6 Zap Q-Learning
4.4 Numerical Results
4.4.1 Finite State-Action MDP
4.4.2 Optimal Stopping in Finance
4.5 Zap-Q with Nonlinear Function Approximation
4.5.1 Choosing the Eligibility Vectors
4.5.2 Theory and Challenges
4.5.3 Regularized Zap-Q
4.6 Conclusions and Future Work
5 Mixed Density Methods for Approximate Dynamic Programming
5.1 Introduction
5.2 Unconstrained Affine-Quadratic Regulator
5.3 Regional Model-Based Reinforcement Learning
5.3.1 Preliminaries
5.3.2 Regional Value Function Approximation
5.3.3 Bellman Error
5.3.4 Actor and Critic Update Laws
5.3.5 Stability Analysis
5.3.6 Summary
5.4 Local (State-Following) Model-Based Reinforcement Learning
5.4.1 StaF Kernel Functions
5.4.2 Local Value Function Approximation
5.4.3 Actor and Critic Update Laws
5.4.4 Analysis
5.4.5 Stability Analysis
5.4.6 Summary
5.5 Combining Regional and Local State-Following Approximations
5.6 Reinforcement Learning with Sparse Bellman Error Extrapolation
5.7 Conclusion
6 Model-Free Linear Quadratic Regulator
6.1 Introduction to a Model-Free LQR Problem
6.2 A Gradient-Based Random Search Method
6.3 Main Results
6.4 Proof Sketch
6.4.1 Controlling the Bias
6.4.2 Correlation of ∇̂f(K) and ∇f(K)
6.5 An Example
6.6 Thoughts and Outlook
Part II: Constraint-Driven and Verified RL
7 Adaptive Dynamic Programming in the Hamiltonian-Driven Framework
7.1 Introduction
7.1.1 Literature Review
7.1.2 Motivation
7.1.3 Structure
7.2 Problem Statement
7.3 Hamiltonian-Driven Framework
7.3.1 Policy Evaluation
7.3.2 Policy Comparison
7.3.3 Policy Improvement
7.4 Discussions on the Hamiltonian-Driven ADP
7.4.1 Implementation with Critic-Only Structure
7.4.2 Connection to Temporal Difference Learning
7.4.3 Connection to Value Gradient Learning
7.5 Simulation Study
7.6 Conclusion
8 Reinforcement Learning for Optimal Adaptive Control of Time Delay Systems
8.1 Introduction
8.2 Problem Description
8.3 Extended State Augmentation
8.4 State Feedback Q-Learning Control of Time Delay Systems
8.5 Output Feedback Q-Learning Control of Time Delay Systems
8.6 Simulation Results
8.7 Conclusions
9 Optimal Adaptive Control of Partially Uncertain Linear Continuous-Time Systems with State Delay
9.1 Introduction
9.2 Problem Statement
9.3 Linear Quadratic Regulator Design
9.3.1 Periodic Sampled Feedback
9.3.2 Event Sampled Feedback
9.4 Optimal Adaptive Control
9.4.1 Periodic Sampled Feedback
9.4.2 Event Sampled Feedback
9.4.3 Hybrid Reinforcement Learning Scheme
9.5 Perspectives on Controller Design with Image Feedback
9.6 Simulation Results
9.6.1 Linear Quadratic Regulator with Known Internal Dynamics
9.6.2 Optimal Adaptive Control with Unknown Drift Dynamics
9.7 Conclusion
References
10 Dissipativity-Based Verification for Autonomous Systems in Adversarial Environments
10.1 Introduction
10.1.1 Related Work
10.1.2 Contributions
10.1.3 Structure
10.1.4 Notation
10.2 Problem Formulation
10.2.1 (Q,S,R)-Dissipative and L2-Gain Stable Systems
10.3 Learning-Based Distributed Cascade Interconnection
10.4 Learning-Based L2-Gain Composition
10.4.1 Q-Learning for L2-Gain Verification
10.4.2 L2-Gain Model-Free Composition
10.5 Learning-Based Lossless Composition
10.6 Discussion
10.7 Conclusion and Future Work
11 Reinforcement Learning-Based Model Reduction for Partial Differential Equations: Application to the Burgers Equation
11.1 Introduction
11.2 Basic Notation and Definitions
11.3 RL-Based Model Reduction of PDEs
11.3.1 Reduced-Order PDE Approximation
11.3.2 Proper Orthogonal Decomposition for ROMs
11.3.3 Closure Models for ROM Stabilization
11.3.4 Main Result: RL-Based Closure Model
11.4 Extremum Seeking Based Closure Model Auto-Tuning
11.5 The Case of the Burgers Equation
11.6 Conclusion
Part III: Multi-agent Systems and RL
12 Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
12.1 Introduction
12.2 Background
12.2.1 Single-Agent RL
12.2.2 Multi-Agent RL Framework
12.3 Challenges in MARL Theory
12.3.1 Non-unique Learning Goals
12.3.2 Non-stationarity
12.3.3 Scalability Issue
12.3.4 Various Information Structures
12.4 MARL Algorithms with Theory
12.4.1 Cooperative Setting
12.4.2 Competitive Setting
12.4.3 Mixed Setting
12.5 Application Highlights
12.5.1 Cooperative Setting
12.5.2 Competitive Setting
12.5.3 Mixed Settings
12.6 Conclusions and Future Directions
13 Computational Intelligence in Uncertainty Quantification for Learning Control and Differential Games
13.1 Introduction
13.2 Problem Formulation of Optimal Control for Uncertain Systems
13.2.1 Optimal Control for Systems with Parameters Modulated by Multi-dimensional Uncertainties
13.2.2 Optimal Control for Random Switching Systems
13.3 Effective Uncertainty Evaluation Methods
13.3.1 Problem Formulation
13.3.2 The MPCM
13.3.3 The MPCM-OFFD
13.4 Optimal Control Solutions for Systems with Parameter Modulated by Multi-dimensional Uncertainties
13.4.1 Reinforcement Learning-Based Stochastic Optimal Control
13.4.2 Q-Learning-Based Stochastic Optimal Control
13.5 Optimal Control Solutions for Random Switching Systems
13.5.1 Optimal Controller for Random Switching Systems
13.5.2 Effective Estimator for Random Switching Systems
13.6 Differential Games for Systems with Parameters Modulated by Multi-dimensional Uncertainties
13.6.1 Stochastic Two-Player Zero-Sum Game
13.6.2 Multi-player Nonzero-Sum Game
13.7 Applications
13.7.1 Traffic Flow Management Under Uncertain Weather
13.7.2 Learning Control for Aerial Communication Using Directional Antennas (ACDA) Systems
13.8 Summary
14 A Top-Down Approach to Attain Decentralized Multi-agents
14.1 Introduction
14.2 Background
14.2.1 Reinforcement Learning
14.2.2 Multi-agent Reinforcement Learning
14.3 Centralized Learning, But Decentralized Execution
14.3.1 A Bottom-Up Approach
14.3.2 A Top-Down Approach
14.4 Centralized Expert Supervises Multi-agents
14.4.1 Imitation Learning
14.4.2 CESMA
14.5 Experiments
14.5.1 Decentralization Can Achieve Centralized Optimality
14.5.2 Expert Trajectories Versus Multi-agent Trajectories
14.6 Conclusion
15 Modeling and Mitigating Link-Flooding Distributed Denial-of-Service Attacks via Learning in Stackelberg Games
Notes:
Description based on print version record.
ISBN:
3-030-60990-1
OCLC:
1257705186

