# Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

@article{AlTamimi2008DiscreteTimeNH, title={Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof}, author={Asma Al-Tamimi and Frank L. Lewis and Murad Abu-Khalaf}, journal={IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)}, year={2008}, volume={38}, pages={943-949} }

Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven in the case of general nonlinear systems. That is, it is shown that HDP converges to the optimal control and the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be exactly solved. The following two standard neural…
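The HDP value-iteration recursion behind the proof, $V_{i+1}(x)=\min_u\{x^\top Qx+u^\top Ru+V_i(f(x)+g(x)u)\}$ with $V_0\equiv 0$, can be sketched on a toy problem. The scalar dynamics, cost weights, and grids below are illustrative choices, not from the paper, and the minimization is done by brute force over a discretized action set rather than the neural-network value/action approximators the paper employs:

```python
import numpy as np

# Illustrative scalar system (not from the paper): x_{k+1} = 0.8*sin(x_k) + u_k,
# stage cost r(x, u) = x^2 + u^2. HDP value iteration starting from V_0 = 0.
xs = np.linspace(-2.0, 2.0, 201)   # state grid
us = np.linspace(-1.5, 1.5, 151)   # action grid
V = np.zeros_like(xs)              # V_0 = 0, as required by the convergence proof

def step(x, u):
    return 0.8 * np.sin(x) + u

for i in range(200):
    X, U = np.meshgrid(xs, us, indexing="ij")      # all (state, action) pairs
    Xn = np.clip(step(X, U), xs[0], xs[-1])        # successor states, kept on grid
    Vn = np.interp(Xn.ravel(), xs, V).reshape(Xn.shape)  # interpolate V_i at successors
    V_new = np.min(X**2 + U**2 + Vn, axis=1)       # V_{i+1}(x) = min_u [r(x,u) + V_i(f(x,u))]
    if np.max(np.abs(V_new - V)) < 1e-8:           # sup-norm convergence check
        break
    V = V_new

# The origin is an equilibrium with zero cost (u = 0), so V*(0) = 0.
print(round(float(V[np.argmin(np.abs(xs))]), 6))   # → 0.0
```

The iterates $V_i$ are nondecreasing and bounded above by the cost of any admissible policy, which is the monotone-convergence structure the paper's proof formalizes.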


#### 732 Citations

Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

- Computer Science, Medicine
- IEEE Transactions on Neural Networks and Learning Systems
- 2014

It is shown that the iterative performance index function is nonincreasingly convergent to the optimal solution of the Hamilton-Jacobi-Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear systems.

On-policy Approximate Dynamic Programming for Optimal Control of non-linear systems

- Computer Science
- 2020 7th International Conference on Control, Decision and Information Technologies (CoDIT)
- 2020

The paper employs the approximate dynamic programming method to solve the HJB equation for deterministic nonlinear discrete-time systems in continuous state and action spaces, and implements a policy iteration algorithm built on an actor-critic architecture.

Approximate Dynamic Programming with Gaussian Processes for Optimal Control of Continuous-Time Nonlinear Systems

- Computer Science
- 2020

A new algorithm for the realization of approximate dynamic programming (ADP) with Gaussian processes (GPs) is proposed for infinite-horizon optimal control problems of continuous-time (CT) nonlinear input-affine systems.

Data-based approximate policy iteration for nonlinear continuous-time optimal control design

- Computer Science, Mathematics
- ArXiv
- 2013

A model-free policy iteration algorithm is derived for the constrained optimal control problem and its convergence is proved; the algorithm can learn the solution of the HJB equation and the optimal control policy without requiring any knowledge of the system's mathematical model.

Data-Driven Finite-Horizon Approximate Optimal Control for Discrete-Time Nonlinear Systems Using Iterative HDP Approach

- Computer Science
- IEEE Transactions on Cybernetics
- 2018

A data-based finite-horizon optimal control approach for discrete-time nonlinear affine systems is presented, and the convergence of the iterative ADP algorithm and the stability of the weight estimation errors based on the HDP structure are analyzed in depth.

Policy Iteration for Optimal Control of Discrete-Time Nonlinear Systems

- Computer Science
- 2017

It is shown that the iterative value function is nonincreasingly convergent to the optimal solution of the Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear system.

Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence

- Computer Science, Mathematics
- Neural Networks
- 2009

The need for partial knowledge of the nonlinear system dynamics is relaxed in the development of a novel approach to ADP using a two-part process: online system identification and offline optimal control training.

Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design

- Computer Science
- Autom.
- 2014

This paper addresses the model-free nonlinear optimal control problem by introducing the reinforcement learning (RL) technique: a data-based approximate policy iteration (API) method that uses real system data rather than a system model.

Value Iteration ADP for Discrete-Time Nonlinear Systems

- Computer Science
- 2017

An iterative $\theta$-ADP algorithm is developed to solve the optimal control problem of infinite-horizon discrete-time nonlinear systems; it is shown that each of the iterative controls can stabilize the nonlinear system and that the requirement of an initial admissible control is effectively avoided.

Online Optimal Control of Affine Nonlinear Discrete-Time Systems With Unknown Internal Dynamics by Using Time-Based Policy Update

- Mathematics, Medicine
- IEEE Transactions on Neural Networks and Learning Systems
- 2012

The Hamilton-Jacobi-Bellman equation is solved forward in time for the optimal control of a class of general affine nonlinear discrete-time systems without using value and policy iterations, and the end result is the systematic design of an optimal controller with guaranteed convergence that is suitable for hardware implementation.

#### References

SHOWING 1-10 OF 66 REFERENCES

Neural Network-Based Nearly Optimal Hamilton-Jacobi-Bellman Solution for Affine Nonlinear Discrete-Time Systems

- Mathematics
- Proceedings of the 44th IEEE Conference on Decision and Control
- 2005

In this paper, we consider the use of nonlinear networks towards obtaining nearly optimal solutions to the control of nonlinear discrete-time systems. The method is based on least-squares successive…

Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach

- Mathematics, Computer Science
- Autom.
- 2005

It is shown that the constrained optimal control law has the largest region of asymptotic stability (RAS), and the result is a nearly optimal constrained state feedback controller that has been tuned a priori off-line.

Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control

- Mathematics, Computer Science
- Autom.
- 2007

It is proven that the algorithm amounts to a model-free iterative method for solving the game algebraic Riccati equation (GARE) of the linear quadratic discrete-time zero-sum game.

Model-free Q-learning designs for discrete-time zero-sum games with application to H-infinity control

- 2007 European Control Conference (ECC)
- 2007

In this paper, the optimal strategies for discrete-time linear system quadratic zero-sum games related to the H-infinity optimal control problem are solved in forward time without knowing the system…

An algorithm to solve the discrete HJI equation arising in the L2 gain optimization problem

- Mathematics
- 1999

A synthesis of the discrete nonlinear $H_{\infty}$ control law boils down to the solution of a set of algebraic and partial differential equations known as the discrete Hamilton-Jacobi-Isaacs (DHJI) equation,…

H∞-control of discrete-time nonlinear systems

- Computer Science, Mathematics
- IEEE Trans. Autom. Control.
- 1996

This paper presents an explicit solution to the problem of disturbance attenuation with internal stability via full information feedback, state feedback, and dynamic output feedback, respectively,…

Adaptive dynamic programming

- Mathematics, Computer Science
- IEEE Trans. Syst. Man Cybern. Part C
- 2002

An adaptive dynamic programming algorithm (ADPA) is described which fuses soft computing techniques, to learn the optimal cost functional for a stabilizable nonlinear system with unknown dynamics, and hard computing techniques, to verify the stability and convergence of the algorithm.

Hamilton-Jacobi-Isaacs formulation for constrained input nonlinear systems

- Mathematics
- 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601)
- 2004

In this paper, we consider the $H_{\infty}$ nonlinear state feedback control of constrained input systems. The input constraints are encoded via a quasi-norm that enables applying quasi-$L$…

Adaptive Critic Designs for Discrete-Time Zero-Sum Games With Application to $H_{\infty}$ Control

- Mathematics, Computer Science
- IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
- 2007

In this correspondence, adaptive critic approximate dynamic programming designs are derived to solve the discrete-time zero-sum game in which the state and action spaces are continuous. This results…

Adaptive linear quadratic control using policy iteration

- Computer Science
- Proceedings of 1994 American Control Conference - ACC '94
- 1994

The stability and convergence results for dynamic-programming-based reinforcement learning applied to linear quadratic regulation (LQR) are presented; the specific algorithm is based on Q-learning and is proven to converge to an optimal controller.