Computation of Reachable Sets Based on Hamilton-Jacobi-Bellman Equation with Running Cost Function

A novel method for computing reachable sets is proposed in this paper. In the proposed method, a Hamilton-Jacobi-Bellman equation with running cost functionis numerically solved and the reachable sets of different time horizons are characterized by a family of non-zero level sets of the solution of the Hamilton-Jacobi-Bellman equation. In addition to the classical reachable set, by setting different running cost functions and terminal conditionsof the Hamilton-Jacobi-Bellman equation, the proposed method allows to compute more generalized reachable sets, which are referred to as cost-limited reachable sets. In order to overcome the difficulty of solving the Hamilton-Jacobi-Bellman equation caused by the discontinuity of the solution, a method based on recursion and grid interpolation is employed. At the end of this paper, some examples are taken to illustrate the validity and generality of the proposed method.


Introduction
The methods to analyze linear systems have been reasonably mature, some crucial characteristics such as stability, controllability and observability have been systematically and strictly defined. Also, the analyses on these characteristics have become the common steps in solving many engineering problems. However, for nonlinear systems, these characteristics are quite difficult to be defined and analyzed [1], such that for some nonlinear systems, their behaviors are difficult to predict.
Reachability analysis is an effective method to study the behavior of nonlinear control systems. By applying reachability analysis, one can solve a variety of engineering problems, especially those involving system safety [2,3,4], feasibility [5], and control law design [6,7]. In reachability analysis, one specifies a set in the state space as the target set and then aims to find a set of initial states of the trajectories that can reach the target set within a given time horizon [8,9]. Such a set is referred to as the reachable set.
However, finding reachable sets is a challenging task, which involves various aspects such as computation and data storage. The most intuitive approach is to verify each point in the state space one by one, however, this approach often consumes a lot of time due to the diversity of system states and control inputs [10,11]. Therefore, formal verification methods are needed.
Unfortunately, it is nearly impossible to compute reachable sets using a analytical approach, except for some special problems [12]. In recent years, various numerical methods have been proposed, which can be divided into two categories: the Lagrangian methods [13,14,15,16] and the methods based on state space discretization [17,18,19,10,2,3]. The former can solve the reachability problems of highdimensional systems, but has high requirements on the form of the control system, and thus is mainly used to solve linear problems. The latter has less requirements on the form of the control system and can therefore be used for nonlinear systems. It is this universality that makes the methods based on state space discretization more widely used in engineering, and the level set method [8,10,2,3] is a representative of them.
In the level set method, a Hamilton-Jacobi-Bellman (HJB) equation without running cost function is constructed, and the terminal condition of this equation is set to the signed distance function of the target set. Then the state space is discretized into a Cartesian grid structure and the HJB equation is numerically solved. During the computation, the values of the equation's solutions at the grid points are stored in an array that has the same dimensions as the state space. Finally, the reachable set is characterized as the zero-level set of the solution.
The principle of the level set method determines that its storage space requirements are quite demanding. To save the reachable set of a given time horizon, one needs to save the solution of the HJB equation at a certain time point, the memory required grow significantly with an increase in the problem's dimension [6,20,21]. To save the reachable sets under different time horizons, one needs to save the solutions of the HJB equation at different time points, which in turn leads to a multiple increase in storage space consumption. In addition, in the level set approach, saving the solutions of the HJB equation at multiple moments is also necessary for designing the control law [6,8,22]. These limitations restrict the development and application of the level set method to some extent.
In order to overcome the above-mentioned limitations, this paper proposes a new method to compute reachable sets. In the proposed method, a HJB equation with running cost function is numerically solved, and the reachable sets under different time horizons can be characterized by different non-zero level sets of the solution of the HJB equation at a certain time point. Such a mechanism can significantly reduce the consumption of storage space and facilitate the design of control law. In addition, more generalized reachability problems can be solved by setting different running cost functions and terminal conditions. In these problems, a performance index can be constructed, which is a combination of a Lagrangian (the time integral of a running cost) and an endpoint cost. The aim is to find a set of initial states of the trajectories that can reach the target set before the performance index increasing to the given admissible cost. In this paper, such a set is referred to as a cost-limited reachable set.
In summary, the main contributions of this paper are as follows: (1) A novel method for computing reachable sets based on the HJB equation with a running cost function is proposed. This method can significantly reduce the storage space consumption and bring convenience to the control law design.
(2) The reachability problem is generalized by setting different running cost functions and terminal conditions, and the definition of cost-limited reachable set is put forward.
(3) To overcome the discontinuity of the solution of the HJB equation, a numerical method based on recursion and grid interpolation is applied to solve the HJB equation.
The structure of this paper is as follows. Section II briefly introduces the reachability problem and the level set method. Section III describes the method to construct the HJB equation of the proposed method and the representation of the reachable set. Section IV generalizes the reachability problem and presents the definition of cost-limited reachable set, and also introduces the method to design control law. A method to solve the HJB equation is proposed in Section V and some numerical examples are given in Section VI. The results are summarized in Section VII.

Reachability problem
Consider a continuous time control system with fully observable state: where s ∈ R n is the system state, u ∈ U is referred to as the control input. The function f (., .) : R n ×U → R n is bounded. Let U denote the set of lebesgue measurable functions from the time interval [0, ∞) to U. Then, given the initial state s t0 at time t 0 , u(.) ∈ U , the evolution of system (1) in time interval [t 0 , t 1 ] can be denoted as a continuous trajectory φ t1 t0 (., s t0 , u(.)) : [t 0 , t 1 ] → R n and φ t1 t0 (t 0 , s t0 , u(.)) = s t0 . Given a target set K and a time horizon T , The definition of reachable set is [14,8]:

Level set method
In level set method, the following HJB equation about V (., .) : R n × R is numerically solved: where l(.) is bounded and Lipschitz continuous, and satisfies K = {s ∈ R n |l(s) ≤ 0}. The solution is approximated on a Cartesian grid of the state space. The reachable set is represented as the zero level set of function V (., 0), i.e.
Based on this principle, several mature toolboxes have been developed [23,24,3] and applied to many practical engineering problems, such as flight control systems [25,26,27,7,28], ground traffic management systems [29,30,31], air traffic management systems [2,3,32], etc. Denote the number of grid points in the ith dimension of the Cartesian grid as N i , then the storage space consumed to save R(K, T ) is proportional to n i=1 N i . It should be noted that, for T 1 , ..., T M ∈ [0, ∞), the expressions of the reachable sets of these time horizons are as follows: Since the value functions V (., T − T 1 ), ..., V (., T − T M ) are each different, the storage space consumption required to save these reachable sets is proportional to M n i=1 N i , see Fig. 1.

Solutions at different time points
Zero-level sets Reachable sets Figure 1: Method of saving reachable sets by the level set method.

Reachability problem and optimal control
Consider a case where the control input u(.) aims to transfer the system state to the target set in the shortest possible time. In this case, if a trajectory can reach the target set in the time horizon T , then the initial state of this trajectory belongs to the reachable set. Therefore, a value function W (.) : R n → R can be constructed as follows: where I(.) : R n → R is a running cost function and I(s) ≡ 1. then the reachable set R(K, T ) can be characterized by the T -level set of W (.), i.e.
Define a modified dynamic system: and a modified running cost function: Given the state s t0 at time t 0 , u(.) ∈ U , the evolution of system (8) in time interval [t 0 , t 1 ] can be denoted asφ t1 t0 (., s t0 , u(.)) : [t 0 , t 1 ] → R n andφ t1 t0 (t 0 , s t0 , u(.)) = s t0 . Based on system (8) and running cost (9), we can also construct another value function: Consequently, Eq. (6) and Eq. (10) have the following equivalence: Proof . The maximum of the value function W (.,T ) is: The trajectories of system (8) correspond to the same trajectories as the evolution of system (1) as long as it evolves outside the target set, once a trajectory of system (8) touches the border of the target set, it stays at the border of the target set and the modifies running cost function is set to 0. Consequently, Therefore, Theorem 1 states that in the region s ∈ R n |W (s,T ) <T , W (.) and W (.,T ) are equal, and for any T ∈ [0,T ), the reachable set R(K, T ) can also be expressed as: In addition, for T 1 , ..., T M ∈ [0,T ), the reachable sets R(K, T 1 ), ..., R(K, T M ) can be represented as different level sets of the value function W (.,T ), and simply save W (.,T ) to save all these reachable sets, i.e.

Construction of HJB equation
Based on system (8) and running cost function (9), an HJB equation with running cost function can be constructed: Proof . WhenT = 0, the following equation holds: From the definition of W (.,T ), this function is a cost-to-go function on the time interval [0,T ]. According to Bellman's principle of optimality [33], for any t ∈ [0, ∞) a small enough ∆t, the cost-to-go function should satisfy the following equation: Substituting Eq. (20) into Eq. (19) yields the following equation:  Fig. 2 illustrates the way to save the reachable sets by the proposed method.
Non-zero level sets Reachable sets

Generalization of reachability problems
In the previous section, the running cost function and the terminal condition of HJB equation are quite special. In fact, the HJB equation can also be used for more general reachability problems by setting different running cost functions and terminal conditions. This section introduces a novel type of reachability problems.

Definition of cost-limited reachable set
A general running cost function is a scalar function of state and control input, denoted as c(., .) : R n × U → R In this section, we assume that: c(s, u) = λ holds, where λ is a positive real number.
The above-mentioned assumption is easily satisfied in engineering practice, such as the fuel consumption and path length per unit time are positive.
The performance index of the evolution of system (1) initialized from s t0 at time t 0 under control input u(.) in time interval [t 0 , t 1 ] is denoted as: where Φ(.) : R n → R is the endpoint cost function. Given a target set K and an admissible cost J, the cost-limited reachable set can be defined: Definition 2 (Cost-limited reachable set).
where "∧" is the logical operator "AND". Informally, under Assumption 1, the performance index increases with the increasing of time, the cost-limited reachable set is a set of initial states of trajectories that can be reach the target set before the performance index increasing to the given admissible cost.
Remark 1. According to Definition 1 and Definition 2, reachable set is a special form of cost-limited reachable set. If c(s, u) ≡ 1 and Φ(s) ≡ 0, then the performance index J t 0 (s 0 , u(.)) = t and the cost-limited reachable set is degenerated into the reachable set, see Fig. 3.

Computation of cost-limited reachable set
Consider a case where the controller aims to transfer the system state to the target set with the least possible cost. If a trajectory can enter the target set before the performance index increasing to the given admissible cost, its initial state must in the cost-limited reachable set. Similar to Eq. (6), a value function can be constructed as follows: The cost-limited reachable set R c (K, J) can be represented as the J-level set of function W c (.), i.e.
Construct a modified running cost function on the basis of Eq. (23): Similar to Eq. (10), based on the modified running cost function (28) and the modified system (8), the following value function can be constructed: Denote min s∈R n Φ(s) = Λ, then Eq. (26) and Eq. (29) have the following equivalence: Proof . The modified running cost function (28) is not less than λ when the trajectory of system (8) evolves outside the target set K. Therefore, Consequently, The proof of Theorem 4 is similar to that of Theorem 2 and will not be repeated here. It follows from Theorem 3 and Theorem 4 that, for J 1 , ..., J M < λT + Λ, the cost-limited reachable sets R c (K, J 1 ), ..., R c (K, J M ) can be represented as different level sets of W c (.,T ) and all these sets can be saved by saving W c (.,T ), i.e.

Non-zero level sets
Cost-limited Reachable sets Figure 4: Method of saving cost-limited reachable sets by the proposed method.

Control law in the level set method
In the level set method, at time t, the optimal control input at state s is [8,6]: For any s 0 ∈ R(K, T ), the trajectory initialized from s 0 can reach the target set K under control law (35) in time T . Since V (s, t) varies with time t, it is required to save V (., .) at each time point t in time interval [0, T ] to implement this control law. This also requires a large amount of storage space.

Control law in the proposed method
The following control law ensures that the trajectory of the system enters the target set at the smallest performance index: According to Theorem 3 and Theorem 4, for any s 0 ∈ s ∈ R n |W c (s,T ) < λT + Λ , W c (s 0 ,T ) = W c (s 0 ). Consequently, for any s 0 ∈ s ∈ R n |W c (s,T ) < λT + Λ , Eq. (36) can be rewritten as: All that needs to be saved to implement control law (37) is the solution of HJB equation (33) at timeT . Therefore, control law (37) significantly reduces the storage space consumption compared to control law (35).
\\ is a small positive number to ensure λT +Λ > J max .
5: Construct the modified system in discretized form (39) and the modified system in discretized form (41); 6: Let W c and W c be two N x × N y arrays; 7: for i ← 0, ..., N x − 1 do \\ Set the terminal condition of the HJB equation to the endpoint cost function. Copy W c to W c ; 22: end for 23: Construct a bilinear interpolation function W c (.) using W;

Method to Solve HJB Equation
Since Eq. (16) is a special form of Eq. (33), this section introduces the method of solving Eq. (33). The analytical solution of Eq. (33) is usually difficult to obtain. To make matters worse, the solution of Eq. (33) is not everywhere differentiable and sometimes even discontinuous, which leads to the difficulty in obtaining the viscosity solution as well [34]. In the current research, a numerical method based on recursion and interpolation is introduced.

Recursive formula of the solution
Divide the time interval [0,T ] into m subintervals of length ∆t =T m . If ∆t is small enough, u(.) can be regarded as a constant in interval [k∆t, (k + 1)∆t] for k ∈ N, and system (1) can be converted into the following discretized form: Thus, the discrete form of system (8) is: The definite integral of the running cost function (23) The recursive formula of the solution of Eq. (33) is:

Approximation of the solution
This subsection introduces a method based on interpolation to approximate W c (., k∆t). The proposed method is similar to the level set method in this respect. First, a rectangular computational domain, denoted as Ω, needs to be specified in the state space and divided into a Cartesian grid structure. The value of the solution at the grid point is stored in an array with the same dimensions as the state space, and W c (., k∆t) is approximated by the grid interpolation. Take a two-dimensional system as an example, and denote the system state as s = [x, y] T . The pseudocode of the proposed method is shown in Algorithm 1.

Numerical Examples
This section provides two examples, the first one about the reachable set of a two-dimensional system, to visually demonstrate the superiority of the proposed method in terms of storage space consumption. The second example is about the cost-limited reachable sets in a practical engineering problem to demonstrate the generality of the proposed method.
6.1 Two-dimensional system example Consider the following system:ṡ The task is to compute the reachable set corresponding to each time horizon. Table 1 outlines the parameters that are specified for the reachable set computation. As this example computes the reachable sets, the running cost function is set to c(s, u) ≡ 1 and the endpoint cost function is set to Φ(s) ≡ 0. Fig. 5 shows the computation results of the proposed method and compares them with those of the level set method (The computational domain and the number of grids used in the level set method are the same as in our method, and the terminal condition of the HJB equation in the level set method is set as V (s, T 4 ) = l(s)). As can be seen in Fig. 5, the results of the proposed method and those of the level set method almost coincide, which indicates that the proposed method has a high accuracy. Fig. 6 visualizes the storage forms of reachable sets in the proposed method as well as in the level set method. Our method only needs to save function W c (.,T ), while the level set method needs to save V (., 0), V (., T 4 − T 3 ), V (., T 4 − T 2 ), and V (., T 4 − T 1 ), consuming four times more storage space than our method.

Planar flight example
A flight vehicle moves in a plane wind field, the vehicle is modeled as a simple mass point with fixed linear velocity v = 1 and controllable heading angular velocity. The behavior of the vehicle in still air is described by the following equation:ẋ where [x, y] T ∈ R 2 and θ ∈ [0, 2π] are the position and heading angle of the vehicle respectively, and u ∈ U = [−1, 1] is the control input. The wind speed at position [x, y] T is determined by the following vector field: The way the proposed method saves the reachable sets.
(b) The way the level set method saves the reachable sets. Figure 6: Ways to save the reachable sets in different methods.
Then, the behavior of the vehicle in the wind field is: where s = [x, y, θ] T . The target set K and task area Ω depend only on x and y and include any positions in the following rectangular regions in x y plane: where γ is the weight of the length of flight path. The admissible costs are J 1 = 0.75, J 2 = 1.5, J 3 = 2.25, J 4 = 3. We consider two cases: (1) γ = 0 and Φ(s) ≡ 0.

Case (1)
In the first case, the running cost function is constant equal to 1 and the endpoint cost is constant equal to 0, the cost-limited reachable sets degenerate into reachable set. The solver setups of this case are summarized in Table 2.  8 shows the computational results of our method and the comparison with the level set method. As can be seen, in this example, the results of the proposed method and the level set method are also in excellent agreement.

Case (2)
In this case, the cost-limited reachable sets no longer degenerate to reachable sets and therefore cannot be computed using the level set method, but can still be computed using the proposed method. In this case, λ = min Therefore,T is set to 4.1. The solver setups of this case are listed in Table 3. The computation results are shown in Fig. 9. In order to verify the correctness of the results and the validity of the control law in Eq. (37), we test some states in slice θ = π of the state space to determine whether the trajectories initialized from these states can reach the target set before the performance index increasing to the given admissible costs under this control law. The verification results are shown in Fig. 10. It can be seen that the outlines of the cost-limited reachable sets and the borders of the areas marked by the green circles almost coincide, which shows the accuracy of the computation of the cost-limited reachable sets and the validity of the control law in Eq. (37).

Conclusions
This paper proposes a new method for computing reachable sets. In the proposed method, the reachable sets of different time horizons are represented by different non-zero level sets of a HJB equation with a running cost function. This approach significantly reduces the storage space consumption for saving reachable sets and designing control laws.
In addition to being able to solve the classical reachability problems, the proposed method can also solve more generalized reachability problems by setting different operating cost functions and different terminal conditions for the HJB equation. The reachable sets in such problems are referred to in this paper as cost-limited reachable sets In order to overcome the discontinuity of the solution of the HJB equation, the current research adopts a method based on recursion and grid interpolation for solving the HJB equation. The paper concludes with some examples to illustrate the effectiveness and generality of the proposed method.
However, the proposed method has some potential for improvement. The main drawback of the proposed method, and also the main drawback of the level set method, lies in the exponential growth of memory and computational cost as the system dimension increases. Some approaches have been proposed to mitigate these costs, such as splitting the original high-dimensional system into multiple low-dimensional subsystems based on the dependencies between the system states [11,21] or the time-scale principle [28,35]. These approaches will be considered in our future works.