Reinforcement Learning for Optimal Feedback Control: A Lyapunov-Based Approach

Bibliographic Details
Main Author: Kamalapurkar, Rushikesh
Other Authors: Walters, Patrick; Rosenfeld, Joel; Dixon, Warren E., 1972-
Published: Cham, Switzerland : Springer, [2018]
Literature type: Book
Language: English
Series: Communications and Control Engineering (ISSN 0178-5354)
Summary: Reinforcement Learning for Optimal Feedback Control develops model-based and data-driven reinforcement learning methods for solving optimal control problems in nonlinear deterministic dynamical systems. In order to achieve learning under uncertainty, data-driven methods for identifying system models in real time are also developed. The book illustrates, through simulations and experiments, the advantages gained from the use of a model and from the use of previous experience in the form of recorded data. The book's focus on deterministic systems allows for an in-depth Lyapunov-based analysis of the performance of the methods during the learning phase and during execution. To yield an approximate optimal controller, the authors focus on theories and methods that fall under the umbrella of actor-critic methods for machine learning. They concentrate on establishing stability during both the learning and the execution phases, and on adaptive model-based and data-driven reinforcement learning to assist the learning process, which typically relies on instantaneous input-output measurements. This monograph provides academic researchers with backgrounds in diverse disciplines, from aerospace engineering to computer science, who are interested in optimal control, reinforcement learning, functional analysis, and functional approximation theory, with a good introduction to the use of model-based methods. The thorough treatment of advanced control methods will also interest practitioners working in the chemical-process and power-supply industries.
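
The central technique named in the summary, actor-critic approximate optimal control with a linear-in-the-parameters value function, can be illustrated with a minimal Python sketch. The snippet below is a simplified, hypothetical example (the scalar dynamics, basis functions, sampling scheme, and learning rate are all assumptions made here for illustration); it trains a critic by driving a Hamilton-Jacobi-Bellman error toward zero and derives the actor greedily from the critic, but it omits the system identification, data-driven excitation, and Lyapunov-based gain conditions developed in the book.

import numpy as np

# Hypothetical scalar system x_dot = f(x) + g(x) u with cost integrand Q*x^2 + R*u^2.
f = lambda x: -x + 0.25 * x**3        # drift dynamics (assumed known in this sketch)
g = lambda x: 1.0                     # control effectiveness
Q, R = 1.0, 1.0                       # state and control cost weights

phi = lambda x: np.array([x**2, x**4])         # value-function basis: V_hat(x) = W @ phi(x)
dphi = lambda x: np.array([2 * x, 4 * x**3])   # gradient of the basis w.r.t. x

W = np.zeros(2)                       # critic weights (linear in the parameters)
eta = 0.05                            # normalized-gradient learning rate

rng = np.random.default_rng(0)
for _ in range(200_000):
    x = rng.uniform(-2.0, 2.0)        # sample states; a stand-in for persistent excitation
    grad_V = W @ dphi(x)              # dV_hat/dx at the sampled state
    u = -0.5 / R * g(x) * grad_V      # actor: policy induced by the current critic
    # Bellman (HJB) error: grad_V * x_dot + running cost should vanish at optimality.
    delta = grad_V * (f(x) + g(x) * u) + Q * x**2 + R * u**2
    omega = dphi(x) * (f(x) + g(x) * u)                      # regressor d(delta)/dW, u held fixed
    W -= eta * delta * omega / (1.0 + omega @ omega) ** 2    # normalized gradient step

print("critic weights:", W)

If the chosen basis can represent the optimal value function well enough, the weights settle toward an approximate solution of the HJB equation on the sampled interval. The book instead couples the critic with online system identification, recorded experience, and Bellman error extrapolation, and establishes closed-loop stability during learning.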
Carrier Form: xvi, 293 pages : color illustrations ; 25 cm.
Bibliography: Includes bibliographical references and index.
ISBN: 9783319783833
3319783831
Index Number: Q325
CLC: TP13; TP181
Call Number: TP181/K151
Contents: Intro; Preface; Contents; Symbols
1 Optimal Control: 1.1 Introduction; 1.2 Notation; 1.3 The Bolza Problem; 1.4 Dynamic Programming; 1.4.1 Necessary Conditions for Optimality; 1.4.2 Sufficient Conditions for Optimality; 1.5 The Unconstrained Affine-Quadratic Regulator; 1.6 Input Constraints; 1.7 Connections with Pontryagin's Maximum Principle; 1.8 Further Reading; 1.8.1 Numerical Methods; 1.8.2 Differential Games and Equilibrium Solutions; 1.8.3 Viscosity Solutions and State Constraints; References
2 Approximate Dynamic Programming: 2.1 Introduction; 2.2 Exact Dynamic Programming in Continuous Time and Space; 2.2.1 Exact Policy Iteration: Differential and Integral Methods; 2.2.2 Value Iteration and Associated Challenges; 2.3 Approximate Dynamic Programming in Continuous Time and Space; 2.3.1 Some Remarks on Function Approximation; 2.3.2 Approximate Policy Iteration; 2.3.3 Development of Actor-Critic Methods; 2.3.4 Actor-Critic Methods in Continuous Time and Space; 2.4 Optimal Control and Lyapunov Stability; 2.5 Differential Online Approximate Optimal Control; 2.5.1 Reinforcement Learning-Based Online Implementation; 2.5.2 Linear-in-the-Parameters Approximation of the Value Function; 2.6 Uncertainties in System Dynamics; 2.7 Persistence of Excitation and Parameter Convergence; 2.8 Further Reading and Historical Remarks; References
3 Excitation-Based Online Approximate Optimal Control: 3.1 Introduction; 3.2 Online Optimal Regulation; 3.2.1 Identifier Design; 3.2.2 Least-Squares Update for the Critic; 3.2.3 Gradient Update for the Actor; 3.2.4 Convergence and Stability Analysis; 3.2.5 Simulation; 3.3 Extension to Trajectory Tracking; 3.3.1 Formulation of a Time-Invariant Optimal Control Problem; 3.3.2 Approximate Optimal Solution; 3.3.3 Stability Analysis; 3.3.4 Simulation; 3.4 N-Player Nonzero-Sum Differential Games; 3.4.1 Problem Formulation; 3.4.2 Hamilton-Jacobi Approximation Via Actor-Critic-Identifier; 3.4.3 System Identifier; 3.4.4 Actor-Critic Design; 3.4.5 Stability Analysis; 3.4.6 Simulations; 3.5 Background and Further Reading; References
4 Model-Based Reinforcement Learning for Approximate Optimal Control: 4.1 Introduction; 4.2 Model-Based Reinforcement Learning; 4.3 Online Approximate Regulation; 4.3.1 System Identification; 4.3.2 Value Function Approximation; 4.3.3 Simulation of Experience Via Bellman Error Extrapolation; 4.3.4 Stability Analysis; 4.3.5 Simulation; 4.4 Extension to Trajectory Tracking; 4.4.1 Problem Formulation and Exact Solution; 4.4.2 Bellman Error; 4.4.3 System Identification; 4.4.4 Value Function Approximation; 4.4.5 Simulation of Experience; 4.4.6 Stability Analysis; 4.4.7 Simulation; 4.5 N-Player Nonzero-Sum Differential Games; 4.5.1 System Identification; 4.5.2 Model-Based Reinforcement Learning; 4.5.3 Stability Analysis; 4.5.4 Simulation; 4.6 Background and Further Reading; References
5 Differential Graphical Games