Synopsis
This book systematically presents optimal control methods for single-controller and multi-controller systems based on adaptive learning control. For both classes of systems, a series of adaptive dynamic programming (ADP) methods is developed to obtain optimal control policies. The book consists of 13 chapters: Chapters 1 and 2 cover the basic principles of adaptive dynamic programming and its implementation; Chapters 3-8 present iterative ADP optimal control methods and performance analyses for single-controller nonlinear systems; Chapters 9-13 present iterative ADP optimal control methods and performance analyses for multi-controller systems.
Contents
1 Introduction
1.1 Optimal Control
1.1.1 Continuous-Time LQR
1.1.2 Discrete-Time LQR
1.2 Adaptive Dynamic Programming
1.3 Review of Matrix Algebra
References
2 Neural-Network-Based Approach for Finite-Time Optimal Control
2.1 Introduction
2.2 Problem Formulation and Motivation
2.3 The Data-Based Identifier
2.4 Derivation of the Iterative ADP Algorithm with Convergence Analysis
2.5 Neural Network Implementation of the Iterative Control Algorithm
2.6 Simulation Study
2.7 Conclusion
References
3 Nearly Finite-Horizon Optimal Control for Nonaffine Time-Delay Nonlinear Systems
3.1 Introduction
3.2 Problem Statement
3.3 The Iteration ADP Algorithm and Its Convergence
3.3.1 The Novel ADP Iteration Algorithm
3.3.2 Convergence Analysis of the Improved Iteration Algorithm
3.3.3 Neural Network Implementation of the Iteration ADP Algorithm
3.4 Simulation Study
3.5 Conclusion
References
4 Multi-objective Optimal Control for Time-Delay Systems
4.1 Introduction
4.2 Problem Formulation
4.3 Derivation of the ADP Algorithm for Time-Delay Systems
4.4 Neural Network Implementation for the Multi-objective Optimal Control Problem of Time-Delay Systems
4.5 Simulation Study
4.6 Conclusion
References
5 Multiple Actor-Critic Optimal Control via ADP
5.1 Introduction
5.2 Problem Statement
5.3 SIANN Architecture-Based Classification
5.4 Optimal Control Based on ADP
5.4.1 Model Neural Network
5.4.2 Critic Network and Action Network
5.5 Simulation Study
5.6 Conclusion
References
6 Optimal Control for a Class of Complex-Valued Nonlinear Systems
6.1 Introduction
6.2 Motivations and Preliminaries
6.3 ADP-Based Optimal Control Design
6.3.1 Critic Network
6.3.2 Action Network
6.3.3 Design of the Compensation Controller
6.3.4 Stability Analysis
6.4 Simulation Study
6.5 Conclusion
References
7 Off-Policy Neuro-Optimal Control for Unknown Complex-Valued Nonlinear Systems
7.1 Introduction
7.2 Problem Statement
7.3 Off-Policy Optimal Control Method
7.3.1 Convergence Analysis of Off-Policy PI Algorithm
7.3.2 Implementation Method of Off-Policy Iteration Algorithm
7.3.3 Implementation Process
7.4 Simulation Study
7.5 Conclusion
References
8 Approximation-Error-ADP-Based Optimal Tracking Control for Chaotic Systems
8.1 Introduction
8.2 Problem Formulation and Preliminaries
8.3 Optimal Tracking Control Scheme Based on Approximation-Error ADP Algorithm
8.3.1 Description of Approximation-Error ADP Algorithm
8.3.2 Convergence Analysis of the Iterative ADP Algorithm
8.4 Simulation Study
8.5 Conclusion
References
9 Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems with Disturbances
9.1 Introduction
9.2 Problem Statement
9.3 Off-Policy Actor-Critic Integral Reinforcement Learning
9.3.1 On-Policy IRL for Nonzero Disturbance
9.3.2 Off-Policy IRL for Nonzero Disturbance
9.3.3 NN Approximation for Actor-Critic Structure
9.4 Disturbance Compensation Redesign and Stability Analysis
9.4.1 Disturbance Compensation Off-Policy Controller Design
9.4.2 Stability Analysis
9.5 Simulation Study
9.6 Conclusion
References
10 An Iterative ADP Method to Solve for a Class of Nonlinear Zero-Sum Differential Games
10.1 Introduction
10.2 Preliminaries and Assumptions
10.3 Iterative Approximate Dynamic Programming Method for ZS Differential Games
10.3.1 Derivation of the Iterative ADP Method
10.3.2 The Procedure of the Method
10.3.3 The Properties of the Iterative ADP Method
10.4 Neural Network Implementation
10.4.1 The Model Network
10.4.2 The Critic Network
10.4.3 The Action Network
10.5 Simulation Study
10.6 Conclusion
References
11 Neural-Network-Based Synchronous Iteration Learning Method for Multi-player Zero-Sum Games
11.1 Introduction
11.2 Motivations and Preliminaries
11.3 Synchronous Solution of Multi-player ZS Games
11.3.1 Derivation of Off-Policy Algorithm
11.3.2 Implementation Method for Off-Policy Algorithm
11.3.3 Stability Analysis
11.4 Simulation Study
11.5 Conclusion
References
12 Off-Policy Integral Reinforcement Learning Method for Multi-player Non-Zero-Sum Games
12.1 Introduction
12.2 Problem Statement
12.3 Multi-player Learning PI Solution for NZS Games
12.4 Off-Policy Integral Reinforcement Learning Method
12.4.1 Derivation of Off-Policy Algorithm
12.4.2 Implementation Method for Off-Policy Algorithm
12.4.3 Stability Analysis
12.5 Simulation Study
12.6 Conclusion
References
13 Optimal Distributed Synchronization Control for Heterogeneous Multi-agent Graphical Games
13.1 Introduction
13.2 Graphs and Synchronization of Multi-agent Systems
13.2.1 Graph Theory
13.2.2 Synchronization and Tracking Error Dynamic Systems
13.3 Optimal Distributed Cooperative Control for Multi-agent Differential Graphical Games
13.3.1 Cooperative Performance Index Function
13.3.2 Nash Equilibrium
13.4 Heterogeneous Multi-agent Differential Graphical Games by Iterative ADP Algorithm
13.4.1 Derivation of the Heterogeneous Multi-agent Differential Graphical Games
13.4.2 Properties of the Developed Policy Iteration Algorithm
13.4.3 Heterogeneous Multi-agent Policy Iteration Algorithm
13.5 Simulation Study
13.6 Conclusion
References