Multi-Agent Distributed Optimal Control for Tracking Large-Scale Multi-Target Systems in Dynamic Environments

This article considers the problem of motion coordination for a multiagent (MA) network whose goal is to track a large-scale multitarget (MT) system in a region populated by dynamic obstacles. We first characterize a density path which corresponds to the expected evolution of the macroscopic state of the MT system, which is represented by the probability density function (PDF) of a time-varying Gaussian mixture (GM). We compute this density path by using an adaptive optimal control method which accounts for the distribution of the (possibly moving) obstacles over the environment described by a time-varying obstacle map function. We show that each target of the MT system can find microscopic inputs that can collectively realize the density path while guaranteeing obstacle avoidance at all times. Subsequently, we propose a Voronoi distributed motion coordination algorithm which determines the individual microscopic control inputs of each agent of the MA network so that the latter can track the MT system while avoiding collisions with obstacles and their teammates. The proposed algorithm relies on a distributed move-to-centroid control law in which the density over the Voronoi cell of each agent is determined by the estimated macroscopic state evolution of the MT system. Finally, simulation results are presented to showcase the effectiveness of our proposed approach.

Alaa Z. Abdulghafoor and Efstathios Bakolas, Senior Member, IEEE

I. INTRODUCTION
W E CONSIDER a special class of multiagent (MA) (multirobot) control problems for multitarget tracking (MTT) in a dynamic domain populated by static and dynamic obstacles. In our proposed problem, the agents of an MA system must compute microscopic control inputs that will allow them to track as a whole the macroscopic state (density) of a large-scale multitarget (MT) system in an uncertain and dynamic environment while ensuring safety (e.g., collision avoidance) at all times. The problem considered herein can be viewed as a combination of a motion coordination problem with an MTT problem in the presence of dynamic obstacles, which, to the best of the authors' knowledge, has never been addressed in a holistic way in the relevant literature before.
Literature Review: The main aim of MTT problems is to approximate the trajectories and corresponding probability density functions (PDFs) of a number of mobile targets based on noisy measurement data. MTT problems are also relevant to the problem of estimating the evolution of the macroscopic state (density) of a large-scale robotic system, and vice versa. A significant portion of the latter approaches are based on methodologies from distributed optimal control (DOC) [1], [2] and reinforcement learning (RL) [3]. Characteristic applications include control of swarm robotics [4] and MA path planning [2]. One key challenge in control and estimation problems for large-scale robotic networks (LSRNs) is dealing with scalability issues that result from the significantly large number of decision variables which appear in the corresponding optimization problems. To overcome such issues, one can utilize methods based on, for example, the mean-field approach [5] or the DOC approach [1], [2], in which an LSRN is characterized macroscopically by the network's density (macroscopic state). The DOC methodology presented in [1] and [2] can find a density path that connects the initial density of an LSRN with its desired final density over a spatial domain populated by static obstacles based on nonlinear programming (NLP) methods.
Another significant challenge in applications of LSRNs is ensuring their adaptability to environmental uncertainties. In particular, an LSRN should be capable of reacting to unexpected changes in the domain of interest (e.g., pop-up obstacles) and adapting accordingly to these uncertainties to avoid system-level instabilities (e.g., collisions with obstacles). On the other hand, the adaptive DOC (ADOC) approach proposed in [6] can be used to estimate the PDF of a large-scale system of agents performing real-time sensing and navigation tasks while ensuring collision avoidance in a highly uncertain environment. As explained in [6], the ADOC approach, which can be put under the umbrella of RL and approximate dynamic programming (RL-ADP) [7], performs better, in terms of different performance metrics (e.g., the time needed for the transition from the original Gaussian mixture (GM) to the final one), than other popular approaches in the field, such as sampling-based path planning (SPP), probability-density-function-based artificial potential field (PDF-APF), or sampling-based artificial potential field (SAPF). The key advantage of the ADOC approach is its ability to adapt to changing environmental information about the dynamic obstacles (online method), in contrast with methods such as the GM filter [8] or the particle filter [9], which do not update the estimated density in real time.
Ensuring safety (e.g., collision avoidance) is one of the major challenges in motion coordination problems for MA systems. Popular methodologies for obstacle avoidance include control barrier functions [10], potential field-based methods [11], velocity obstacle approaches [12], leader-follower control strategies [13], and game-theoretic solutions [14]. Another way to ensure safety under a distributed implementation framework is to utilize coverage control methods based on Voronoi-like tessellations consisting of obstacle-aware Voronoi cells (OAVCs) [15]. In this method, dynamic weights are assigned between agents and nearby obstacles.
Contributions: In this work, we propose a decentralized motion coordination control approach that will allow an MA system to track a large-scale target system (LSTS) while avoiding collisions with (possibly moving) obstacles in their shared space. Our approach aims at addressing two interconnected problems. The goal in the first problem (Problem 1) is to compute an optimal density path that will transfer the macroscopic state of the LSTS to a desired density while accounting for available environmental information (obstacle map). In our approach, the macroscopic state of the LSTS is defined by a (time-varying) GM density. To address the latter density path planning problem, we will adopt the ADOC method proposed in [6], which is based on tools from optimal mass transport theory [16]. The solution approach is implemented online in a centralized way to accommodate instant updates on the environmental state by solving tractable optimization problems. Estimating the density evolution of an LSTS while adapting to environmental information (e.g., moving obstacles) is one of the novelties of our approach, given that the relevant work on MTT does not consider the effect of environmental factors on the targets' evolution. In addition, our approach extends the ADOC approach, which was originally proposed in [6], to handle MTT problems under the (realistic) constraint that the targets must avoid collisions with both dynamic and static obstacles in the uncertain domain.
After obtaining the optimal estimated density path of the targets, we consider a microscopic control problem for the targets of the LSTS (Problem 2). The goal of the latter problem, which is auxiliary to Problem 1, is to compute microscopic inputs for the targets of the LSTS whose application will allow the macroscopic state of the LSTS to stay close (in the L^2 norm sense) to the optimal density path corresponding to the solution of Problem 1 while safety (obstacle avoidance) is ensured at the microscopic level. To solve the latter microscopic control problem, we employ a centralized artificial potential field (APF) control law [6] which, in our approach, is conditioned on the computed density path of the LSTS as well as on the observed positions of its constituents. In comparison with the centralized APF control algorithm proposed in [6], our numerical simulations show that our proposed distributed control law combined with the ADOC approach [6] achieves better tracking performance.
The optimal GM density path which solves the first problem is used in the last problem (Problem 3), which corresponds to a decentralized coverage control problem in an uncertain region populated by dynamic obstacles. The latter problem will provide us with the individual control inputs for the members of the MA network whose application will allow the macroscopic state of the latter network to track the macroscopic state of the LSTS while safety at the microscopic level is ensured at all times. To solve the latter problem, we leverage ideas and tools from distributed coverage control [17], [18], [19], [20] which not only ensure safety but also offer the desired theoretical guarantees of convergence. In particular, our proposed algorithm utilizes a variation of Lloyd's algorithm (the latter algorithm ensures convergence) along with modified Voronoi decompositions of the domain that consist of OAVCs [15], [20] (the use of the latter spatial partitions ensures collision avoidance at all times). The fact that our approach ensures safety at all times differentiates it from the majority of the available MA control methods for MTT, which do not consider or provide safety guarantees (or provide guarantees that are only local). One key novelty of our approach is its applicability to problems in which the domain is populated by obstacles (static and/or dynamic) with a wide range of sizes (OAVCs were originally designed for problems with static circular disk-shaped obstacles). Our numerical simulations also showcase the robustness of the proposed algorithm to changes in the number of obstacles. Overall, the utilized control and estimation techniques of our proposed approach allow the MA network to adapt its actions in response to instantaneous environmental changes, such as the sudden appearance of obstacles in new locations or the movement of obstacles to new locations.
In addition, our Voronoi-based approach allows the agents to compute their control actions by utilizing only local information, namely, the information encoded in their own (generalized) Voronoi cells (in accordance with the Voronoi-distributed control paradigm). The ability of our method to be implemented in a Voronoi-distributed way is in sharp contrast with the centralized ADOC-based control algorithm proposed in [6] (it should be emphasized here that in our work, the ADOC algorithm is used only for the modeling of the targets' movement). In particular, Voronoi-based methods do not require direct and continuous communication among the network's members nor access to global information about the locations of other agents or the targets for computing the (microscopic) control law for each agent. This leads to a significant reduction in computational complexity [21] and an overall simplification of the design of control algorithms for large-scale MA networks.
In general, the application of the distributed MA motion coordination algorithm will compute individual control inputs for the agents of the MA network which cannot make the macroscopic state of the network precisely track the density path of the LSTS that corresponds to the (centralized) solution to Problem 1. Therefore, Problems 1-3 cannot be treated as independent but have to be addressed simultaneously. This is another novel feature of our approach which distinguishes it from the relevant literature, in which the MTT and coverage control problems are solved independently. Our combined approach yields a control algorithm that provides a holistic solution to the MA control problem for tracking a large-scale system of targets in the presence of dynamic obstacles. In our proposed algorithm, the reference density path of the LSTS (which is determined by the solution to Problem 1) is periodically updated to better reflect the actual density evolution of the LSTS (the latter is determined by the microscopic actions of the moving targets which correspond to the solution of Problem 2) as well as the current configuration and geometry of the moving obstacles in the domain. In addition, the execution of the individual control actions of the agents of the MA network (determined by the solution to Problem 3) will allow the macroscopic state of the MA network to track the density path of the LSTS (approximated by the density of a time-varying GM) while avoiding collisions with each other as well as with the (dynamic and/or static) obstacles in the uncertain domain.
Outline: The remaining part of this article is structured as follows. In Section II, we formulate the density path planning problem and the MA motion coordination problem. The solution approaches to these problems are presented in Section III. Numerical simulations are provided in Section IV. Finally, this article is concluded with a summary of remarks and directions for future research in Section V.

A. Preliminaries
The discrete (or integer) interval from integer j_a to integer j_b with j_a ≤ j_b is denoted by [j_a, j_b]_d. We denote the PDF of a multivariate (k-dimensional) Gaussian distribution N(μ, Σ) with mean μ ∈ R^k and covariance matrix Σ ∈ R^{k×k} with Σ = Σ^T > 0 as follows:

N(x; μ, Σ) := ((2π)^k |Σ|)^{-1/2} exp(-(1/2)(x − μ)^T Σ^{-1} (x − μ))

where |Σ| denotes the determinant of Σ. The (m − 1)-standard simplex is denoted by Δ^{m−1} and is defined as the set composed of all vectors λ ∈ R^m with nonnegative entries that sum to one. Given two Gaussian PDFs p_p = N(x; μ_p, Σ_p) and p_q = N(x; μ_q, Σ_q), the Wasserstein metric W_2(p_p, p_q) is given by

W_2(p_p, p_q)^2 = ‖μ_p − μ_q‖^2 + tr[Σ_p + Σ_q − 2(Σ_q^{1/2} Σ_p Σ_q^{1/2})^{1/2}]

where tr[·] denotes the trace operator. Given a compact set Ω ⊆ R^k, the spatial L^2 norm of a continuous function g(x) : Ω → R is defined as follows: ‖g(·)‖_{L^2} := (∫_Ω |g(x)|^2 dx)^{1/2}. The ceiling operator is denoted by ⌈·⌉. Finally, the inner product ⟨g(x), h(x)⟩ of two continuous functions g(x) : Ω → R and h(x) : Ω → R defined over a compact set Ω ⊆ R^k is determined by ⟨g(x), h(x)⟩ := ∫_Ω g(x) h(x) dx.
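For concreteness, the closed-form Gaussian Wasserstein metric above can be evaluated numerically; the sketch below uses NumPy/SciPy (the function name and argument conventions are ours, not the article's):

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussians(mu_p, Sig_p, mu_q, Sig_q):
    """Squared 2-Wasserstein distance between N(mu_p, Sig_p) and N(mu_q, Sig_q):
    W2^2 = ||mu_p - mu_q||^2 + tr(Sig_p + Sig_q - 2 (Sig_q^{1/2} Sig_p Sig_q^{1/2})^{1/2})."""
    mu_p, mu_q = np.asarray(mu_p, float), np.asarray(mu_q, float)
    Sig_p, Sig_q = np.asarray(Sig_p, float), np.asarray(Sig_q, float)
    sq = sqrtm(Sig_q)                       # principal square root of Sig_q
    cross = sqrtm(sq @ Sig_p @ sq)          # (Sig_q^{1/2} Sig_p Sig_q^{1/2})^{1/2}
    return float(np.sum((mu_p - mu_q) ** 2)
                 + np.trace(Sig_p + Sig_q - 2.0 * np.real(cross)))
```

For identical covariances the trace term vanishes, so the metric reduces to the squared distance between the means.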

B. Problem Statement
Let us consider an MA network that consists of N_1 agents which must track an LSTS comprising N_2 (mobile) targets which are divided into N_k groups. Both agents and targets are moving in a compact set Ω ⊂ R^2 which is populated by dynamic (and/or static) obstacles.
First, we consider the problem of computing and adaptively updating a density path that will take the initial prior GM density p_0 = Σ_{i=1}^{N_0} w_0^i g_0^i of the LSTS to a final desired prior GM density p_{T_d} = Σ_{i=1}^{N_{T_d}} w_{T_d}^i g_{T_d}^i, where N_0 and N_{T_d} are the number of Gaussian components g_0^i and g_{T_d}^i of the initial and final GMs, respectively. The goal of the next problem, which is auxiliary to the first problem, is to compute the microscopic control inputs of the targets which will allow them as a whole to follow the density path obtained as the solution to the first problem while guaranteeing collision avoidance at the microscopic level. The goal of the last problem is to design a distributed motion coordination algorithm that will determine the microscopic control actions of the agents of the MA system that will allow the macroscopic state of the latter to track the macroscopic evolution of the LSTS while ensuring safety at all times based on local information only. Next, we provide more detailed formulations of the aforementioned problems.
The set occupied by the ith obstacle at time t is denoted by O_i(t), whereas O(t) denotes the set occupied by all the dynamic obstacles in the uncertain area at time t, namely, O(t) := O_1(t) ∪ · · · ∪ O_{N_O}(t). Finally, the set that is not occupied by obstacles (obstacle-free set) is denoted by A(t), that is, A(t) := Ω\O(t). The location, geometry, and number of obstacles as well as their actual layout O(t) are unknown a priori; yet, an initial approximate map denoted by m_0(q, t_0) is provided to the targets as well as to the MA network to help define the approximate initial set occupied by all obstacles, denoted by Ô_0 = Ô(t_0) ⊆ Ω, at time t_0. Thus, the approximated layout of the dynamic obstacles at time t is identified by the function m(q, t), which is referred to as the obstacle map function. Moreover, let M denote the set of map functions m(·, t) which represent all plausible obstacle layouts in Ω. The approximate map function m(q, t), the layout of the obstacles Ô(t) ⊆ Ω, and the vector of the obstacle centers Ô(t) = [ô_1(t); · · · ; ô_{N_O}(t)] will be updated instantaneously with time according to the movement of the obstacles in Ω.
In addition, consider a fixed time interval Δt > 0 which determines the rate at which we update the estimated density of the LSTS as the targets move from their initial locations at time t_0 to their final locations at time t_f. Let the discrete-time instants be indicated as t_k = t_0 + kΔt with k = 1, . . ., T_f; then the dynamics of the nth target is described by

x_{k+1,n} = f(x_{k,n}, u_{k,n})     (3)

where the position of the nth target at time t_k is given by x_{k,n} = x_n(t_k) ∈ Ω and its control input by u_{k,n} = u_n(t_k). Moreover, the initial position of the nth target at time t_0 is denoted by x_n^0. The target's dynamics is expressed by the (discrete-time) vector field f(·) (the latter is typically computed after discretizing a continuous-time vector field which is defined such that the existence and uniqueness of solutions to the corresponding ordinary differential equations is ensured in the time interval of interest). Also, the spatial domain Ω can be rendered invariant under the nth target's vector field with the choice of an appropriate control input (in other words, there exists a control input with the application of which the nth target will never leave the domain Ω).
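As a minimal illustration of the discrete-time model x_{k+1,n} = f(x_{k,n}, u_{k,n}) and of the domain-invariance property discussed above, the sketch below uses a hypothetical single-integrator vector field over a box domain kept invariant by clipping; the step size, bounds, and function name are our assumptions, not the article's model:

```python
import numpy as np

def target_step(x_k, u_k, dt=0.1, bounds=(0.0, 10.0)):
    """One step of a (hypothetical) single-integrator target model
    x_{k+1} = f(x_k, u_k) = x_k + dt * u_k, with the state clipped to the
    box domain so that the domain remains invariant under the dynamics."""
    lo, hi = bounds
    return np.clip(np.asarray(x_k, float) + dt * np.asarray(u_k, float), lo, hi)
```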
Since the targets' positions x_n(t_k) ∈ Ω, for n ∈ [1, N_2]_d, are time-varying vectors, we can deduce that there exists a time-varying PDF that characterizes the macroscopic configuration of the LSTS at time t_k. We denote this PDF as p_k = p(q, t_k). The aim is to estimate the latter PDF under the assumption that it can be approximated by a GM density function, that is,

p(q, t_k) ≈ Σ_{i=1}^{N_k} w^i(t_k) g^i(q, t_k)     (4)

where N_k is the number of Gaussian components (each of which corresponds to a different group of targets that the LSTS is divided into), g^i(q, t_k) = N(q; μ^i(t_k), Σ^i(t_k)) symbolizes the approximated Gaussian density of the ith group of targets at time t_k, and w^i(t_k) = w_k^i its corresponding weight component, where w(t_k) = [w^1(t_k); . . .; w^{N_k}(t_k)] ∈ Δ^{N_k−1}.
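The GM approximation above can be evaluated pointwise as a weighted sum of Gaussian densities; a small NumPy sketch (the function names are ours):

```python
import numpy as np

def gaussian_pdf(q, mu, Sig):
    """k-variate Gaussian density N(q; mu, Sig)."""
    q, mu = np.asarray(q, float), np.asarray(mu, float)
    Sig = np.asarray(Sig, float)
    k = mu.size
    d = q - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** k * np.linalg.det(Sig))
    return float(norm * np.exp(-0.5 * d @ np.linalg.inv(Sig) @ d))

def gm_density(q, weights, mus, Sigs):
    """Evaluate p(q, t_k) = sum_i w^i(t_k) g^i(q, t_k) for a Gaussian mixture."""
    return float(sum(w * gaussian_pdf(q, mu, S)
                     for w, mu, S in zip(weights, mus, Sigs)))
```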
Problem 1 (Density Path Planning for the LSTS): Given: A compact set Ω ⊂ R^2, a team of N_2 targets whose motion is described by the discrete-time dynamics given in (3), an approximate initial obstacle map function m_0(q, t_0), an approximate initial layout of the obstacles Ô_0, a prior PDF p_0 := Σ_{i=1}^{N_0} w_0^i g_0^i which describes the distribution of targets at time t = t_0, and a final desired prior density p_{T_d}. Goal: Find a density path in the space of GM models (GMMs) that solves the following optimization problem:

min J := φ(p_{T_f}) + Σ_{k=0}^{T_f−1} L(p_k, p_{k+1}; m_k)     (5)

subject to p_{k+1} = C_k(p_k, m_k), where C_k : P × M → P is the functional control law, that is, the law that maps the current density and obstacle map to the density at the next stage. In this problem formulation, T_f is the final stage (corresponding to a terminal time t_f > t_0), J is the cost function (or performance index), φ is the terminal cost which is defined as

φ(p_{T_f}) := min_{P(w_{T_f}, w_{T_d}) ∈ Π(w_{T_f}, w_{T_d})} Σ_i Σ_j λ^{ij} W_2(g_{T_f}^i, g_{T_d}^j)^2

and L is the running cost which is defined as

L(p_k, p_{k+1}; m_k) := min_{P(w_k, w_{k+1}) ∈ Π(w_k, w_{k+1})} Σ_i Σ_j λ^{ij} W_2(g_k^i, g_{k+1}^j)^2 + ⟨p_{k+1}, m_k⟩

where P(w_{T_f}, w_{T_d}) and P(w_k, w_{k+1}) denote, respectively, the joint density of the probability vectors/weight vectors (w_{T_f}, w_{T_d}) and (w_k, w_{k+1}), with λ^{ij} the corresponding joint weights and Π(·, ·) the corresponding density space, m_k = m(q, t_k) denotes the obstacle map function at the discrete time step t_k, and ⟨p_{k+1}, m_k⟩ denotes the inner product on the domain Ω, which represents the probability that the targets at time t_{k+1} are deployed within the estimated obstacle layout Ô(t_k).
Remark 1: It is worth mentioning that despite the fact that the density path of the LSTS (solution of Problem 1) is updated with respect to the approximated obstacle layout that changes with time (dynamic obstacles) in order to reduce the mass concentration of the LSTS over areas occupied by obstacles (obstacle avoidance at the macroscopic level), there are no guarantees of obstacle avoidance at the microscopic level. Finding microscopic control inputs that will ensure safety will be the goal of Problem 2, which we will introduce later on.
Let us denote by p_T(q, t_k) the density of the targets over the domain Ω, which is characterized as follows:

p_T(q, t_k) := Σ_{j=1}^{N_T} w_T^j(t_k) g_T^j(q, t_k).     (9)

One possible way to compute the GM p_T is by using the GMM likelihood optimization (GMMLO) algorithm. The GMMLO algorithm, which is based on the expectation-maximization (EM) algorithm, obtains the estimates of the means μ_T^j(t_k), covariance matrices Σ_T^j(t_k), and mixing proportions w_T^j(t_k), for j ∈ [1, N_T]_d, which maximize the likelihood function. Thus, provided that the number of Gaussian components N_T and the point set T(t_k) consisting of the targets' locations in Ω at time t_k ≥ 0 are given as inputs to the GMMLO algorithm, the output will be a GM density p_T(q, t_k) which consists of N_T Gaussian mixands, along with a probability vector (known as the mixing proportion/weight vector) w_T(t_k) ∈ Δ^{N_T−1} for all t_k ≥ t_0. Next, we will introduce the microscopic control problem for the LSTS, which seeks the individual actions of the targets that will induce the density path (macroscopic state trajectory of the LSTS) that solves Problem 1.
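The GMMLO step can be approximated with any off-the-shelf EM implementation; the sketch below uses scikit-learn's GaussianMixture as a stand-in (scikit-learn is our choice here, and the article's GMMLO algorithm may differ in its likelihood-optimization details):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_target_density(T_k, n_components):
    """EM-based stand-in for the GMMLO step: fit a GM density to the point
    set T(t_k) of observed target positions (an array of shape (N_2, 2))
    and return the mixing proportions w_T, means mu_T, and covariances."""
    gm = GaussianMixture(n_components=n_components, random_state=0).fit(T_k)
    return gm.weights_, gm.means_, gm.covariances_
```

Given the fitted weights, means, and covariances, the target density p_T(q, t_k) is evaluated as the corresponding weighted sum of Gaussian densities.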
Problem 2 (Microscopic Control of LSTS With Safety Guarantees): Given: An approximation of the map function m(q, t_k), the approximate obstacle layout Ô(t_k), the vector Ô(t_k), the observed targets' joint microscopic state at time t_k, and the optimal density path p*_k = p*(q, t_k) that solves Problem 1. Goal: Obtain microscopic control laws u_n(·), for n ∈ [1, N_2]_d, that will ensure safety at all times and collectively induce a macroscopic control with the application of which the macroscopic state of the LSTS will be able to track the estimated optimal reference density path p*_k given the obstacle map function m(q, t_k) ∈ M and the dynamic obstacle layout Ô(t_k). In particular, the induced targets' density p_T should satisfy ‖p_T(·, t_k) − p*(·, t_k)‖_{L^2} → 0 as k → T_f.

Remark 2: It is worth mentioning that the density path that solves Problem 1 alone can only ensure that the mass concentration over areas occupied by obstacles is small but cannot guarantee obstacle avoidance at the microscopic level. The objective in Problem 2 is to compute microscopic control actions which will make the targets move in accordance with the optimal density path (solution to Problem 1) while guaranteeing collision avoidance at the microscopic level.
A high-level description of Problems 1 and 2 is given in Fig. 1(a). In this figure, the targets' locations are represented by the red stars, whereas the Gaussian density components of the optimal GM density path which describes the PDF (macroscopic state) of the LSTS are illustrated by the red dashed elliptic footprints. In addition, the gray-filled circles represent the obstacle layout. The objective in Problem 1 is to estimate and compute the GM density path (at each time instant, the GM PDF is the weighted sum of the Gaussian densities which are indicated as dashed red elliptic footprints) that will transfer the macroscopic state of the LSTS from an initial density to a terminal density given the obstacle layout (indicated by the gray area). Problem 2 seeks the microscopic actions of the individual targets that will realize the previous density path while ensuring safety (obstacle avoidance) at all times.
2) Multiagent Motion Coordination Control for Tracking the LSTS in the Presence of Obstacles: In this section, we introduce the motion coordination control problem which seeks the microscopic control inputs for the agents of an MA network that will allow them to macroscopically track the computed/estimated optimal GM density path of the LSTS (solution to Problem 1) while ensuring collision avoidance at the microscopic level. Therefore, it is expected that the MA network's density p_A(q, t) will eventually approach (in the L^2 sense) the optimal GM PDF of the LSTS, p*(q, t), while avoiding the dynamic obstacles (cluttered uncertain environment). In addition, the agents move in such a way that their positions conform with the time-varying GMM density p(q, t). The weight components (ŵ^i)* for all i ∈ [1, N_k]_d associated with the Gaussian densities g^i for all i ∈ [1, N_k]_d forming the GMM p(q, t) defined in (4) are updated as new information becomes available to the agents.
Consider a team of N_1 homogeneous mobile agents distributed over an uncertain cluttered domain Ω, which includes N_O dynamic obstacles that are centered at Ô(t) and occupy the set O(t), whose approximate layout Ô(t) ⊆ Ω at each time t is given by the obstacle map function m(q, t). The motion of each individual mobile agent is determined by the single-integrator dynamics

ṗ_i(t) = u_i(t)

where the position of the ith agent in Ω at time t is denoted as p_i(t) and u_i(t) = [u_{i,1}(t); u_{i,2}(t)]^T is its velocity vector (control input). The set of positions of the N_1 agents at the instant t is denoted by P(t) = {p_1(t), . . ., p_{N_1}(t)}. Furthermore, we define an approximation of the MA system's density in terms of a GM density which is denoted as p_A. For the estimation of p_A, the set of the agents' locations P(t) is used. For the computation of the GM p_A(q, t), one can use, for example, the GMMLO algorithm as we did for the computation of p_T(q, t) in (9). The precise formulation of the MA target tracking problem is given next.

Problem 3 (MA Motion Coordination for Tracking an LSTS in the Presence of Obstacles): Given: An approximation of the map function m(q, t), the vector Ô(t), the approximate obstacle layout Ô(t), the set of initial positions of the N_1 agents, and the optimal density path function p*(q, t) which is the solution to the first problem (Problem 1). Moreover, we have a (time-varying) partition {A_i : i ∈ [1, N_1]_d} of the obstacle-free set A(t) ⊂ Ω, where p_i(t) ∈ A_i (the ith obstacle-free cell A_i is associated with the ith agent of the MA network), and a locational cost function H(p, t; A(t)) which is defined as

H(p, t; A(t)) := Σ_{i=1}^{N_1} ∫_{A_i} ‖q − p_i(t)‖^2 p*(q, t) dq.     (14)

Goal: Obtain a distributed controller that minimizes the locational cost characterized in (14) and forces the agents to track macroscopically the PDF of the very-large-scale system of N_2 targets while guaranteeing collision avoidance with all the dynamic obstacles in the uncertain region Ω, that is, a controller under which p_i(t) remains in the obstacle-free set A(t) for all t while H(p, t; A(t)) is (asymptotically) minimized.
Remark 3: Problem 3 seeks the microscopic control actions whose execution will allow the MA network to track macroscopically the LSTS by (asymptotically) minimizing the locational optimization cost H(p, t; A(t)) which is defined in (14). The control inputs that achieve the latter goal will make the density of the latter network approach (in the L^2 sense) the optimal density path of the targets asymptotically, that is, there exists a sufficiently small ε ≥ 0 such that

lim sup_{t→∞} ‖p_A(·, t) − p*(·, t)‖_{L^2} ≤ ε.

Fig. 1(b) provides a high-level illustration of Problem 3 given the solution of Problems 1 and 2. Thicker and darker footprints characterize the Gaussian densities associated with higher estimated weights in the corresponding GM distribution, implying that more agents are attracted toward the group of targets that has higher mixing proportions associated with its own Gaussian components in the corresponding GMM.
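A discrete-grid sketch of the move-to-centroid law that underlies the locational cost (14) is given below: each grid point is assigned to the nearest agent (a plain Voronoi partition rather than the OAVCs used in this article), and each agent moves toward the p*-weighted centroid of its cell. The function name and the gain parameter are our assumptions:

```python
import numpy as np

def lloyd_step(agents, grid, density, gain=0.5):
    """One move-to-centroid update: assign each grid point to its nearest
    agent (a discrete Voronoi partition), compute the density-weighted
    centroid of each cell, and move each agent a fraction `gain` toward it.
    `density` holds p*(q, t) evaluated at the grid points."""
    agents = np.asarray(agents, float)
    d2 = ((grid[:, None, :] - agents[None, :, :]) ** 2).sum(-1)  # squared dists
    owner = d2.argmin(axis=1)                                    # Voronoi labels
    new = agents.copy()
    for i in range(len(agents)):
        cell = owner == i
        mass = density[cell].sum()
        if mass > 0:                                             # weighted centroid
            c = (density[cell, None] * grid[cell]).sum(0) / mass
            new[i] = agents[i] + gain * (c - agents[i])
    return new
```

Iterating this update is the essence of Lloyd's algorithm; in the article's setting the plain nearest-agent partition would be replaced by the obstacle-aware cells to obtain the collision-avoidance guarantees.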

A. Solution to the Optimal Density Path Planning for LSTS
In this section, we will discuss the solution approach for Problem 1. We will utilize the ADOC approach proposed in [6] to compute a density path for the LSTS, p(q, t), in the space of GM probability densities. At a given time t = τ, p(q, τ) corresponds to a GM PDF whose Gaussian components are associated with different subgroups of targets (the relative sizes of the different subgroups are reflected in the corresponding estimated weights/mixing proportions of the GMM). The ADOC approach, which can be categorized as an RL-ADP approach [7], provides a density path that is updated in accordance with the approximated obstacle map m(q, t) at each time instant.
1) Approximating the Dynamic Obstacle Layout and Map Function: As mentioned in Section II-B1, the approximate set occupied by all obstacles, denoted as Ô(t), is characterized by the obstacle map function m(·, t) ∈ M which represents the approximate layout of obstacles in the domain Ω. There are several techniques for characterizing the latter map function, such as the Hilbert occupancy map [6], [22], the Gaussian process occupancy map, and the traditional occupancy grid map. In this work, we will utilize the Hilbert occupancy map [6], [22]. In particular, the map function m(q, t) is described as a binary occupancy map defined on the domain Ω at a certain time t, which is derived by thresholding the Hilbert occupancy map h(q, t) ∈ [0, 1] defined in [6] and [22]: a point q is labeled as occupied at time t when h(q, t) exceeds a given occupancy threshold. More details on the latter technique can be obtained from [6], [22], and [23].
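A minimal sketch of the thresholding step that turns a Hilbert occupancy map h(q, t) ∈ [0, 1] into a binary obstacle map value m(q, t); the threshold value 0.5 and the function name are our assumptions:

```python
def obstacle_map(h, q, t, threshold=0.5):
    """Binary obstacle map m(q, t) derived from a Hilbert occupancy map
    h(q, t) in [0, 1]: a point is labeled occupied (1) when its occupancy
    probability reaches the threshold, and free (0) otherwise."""
    return 1 if h(q, t) >= threshold else 0
```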
2) Adaptive Distributed Optimal Control Approach: The macroscopic goal of the LSTS is to generate and adaptively update a density path that will transfer its initial GM density p_0 to a desired final GM density p_{T_d} while accounting for the current environmental conditions (e.g., dynamic obstacles), which are continuously updated based on new observations. Since the obstacle map is time varying, one cannot obtain the density path of the LSTS that minimizes the objective function J defined in (5) of Problem 1 by directly applying the standard DOC approach presented in [1], [2], and [24], which only works in the case of static obstacle maps. Instead, in this work we formulate the LSTS density path planning problem as a relevant RL-ADP problem which seeks an optimal control functional (or macroscopic controller) that minimizes the performance index J. Placing the problem under the umbrella of RL-ADP problems allows one to obtain an optimal control functional (macroscopic controller) in an adaptive way. The control functional (or macroscopic controller) C(·, m_k) is updated in real time with regard to the dynamic obstacle layout defined by m_k at time step k, and its application generates the targets' optimal PDF p*_{k+1}. Consequently, the LSTS density will move to the optimal PDF p*_{k+1} at time step (k + 1). The realization of the latter density evolution will be enabled by appropriate microscopic control inputs executed by the targets of the LSTS, which will be defined later in Section III-A3. In this approach, we assume that the PDF of the LSTS belongs to the GMM space, where N_k and N_{T_d} are the number of Gaussian mixands at step k and at the final desired goal, respectively. Thus, given the PDF of the LSTS at time step k, p_k = Σ_{i=1}^{N_k} w_k^i g_k^i, the control functional (macroscopic controller) can be described as

the probability vectors (weights vectors) and g
Therefore, given p_k and m_k, the control functional on the right-hand side of (18) is parameterized by the corresponding tuple of GM parameters. Following all the assumptions and simplifications in [6], the computation of the targets' optimal PDF (density) can be carried out as follows. First, let us define a trajectory set T_k whose elements define paths of Gaussian mixands of the form (g_{k+1}^ı, . . .). Thus, we can define the minimum cost function L_k^{ı,j} of the path of Gaussian components T_k^{ı,j} at the kth stage.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Let us denote a Gaussian components' trajectory by T_k^{i,ı} = (g_k^i, g_{k+1}^ı); then the cost function of T_k^{i,ı} can be denoted as L_k^{i,ı} and defined with respect to m_k. Now, let us introduce marginal constraints on the joint probabilities of the Gaussian-component pairs; then the optimal control functional (macroscopic controller) can be computed by solving the finite-dimensional optimization problem in (23). Let π_k^* be a minimizer of (23); then the optimal PDF of p(q, t) at time step k + 1, p*_{k+1}, is given by (24). Computationally tractable methods to implement the previously described ADOC approach can be found in [6].
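The finite-dimensional problem in (23) seeks a joint probability matrix over pairs of Gaussian components with prescribed marginals that minimizes the accumulated pairwise costs L_k^{i,ı}. One common computational surrogate for such a coupling problem is an entropy-regularized (Sinkhorn) solver; this is a sketch of that idea, not necessarily the solver used in [6].

```python
import numpy as np

def sinkhorn(cost, w_src, w_dst, eps=0.2, iters=2000):
    """Entropy-regularized coupling pi minimizing <pi, cost> subject to
    pi @ 1 = w_src and pi.T @ 1 = w_dst (a surrogate for problem (23))."""
    K = np.exp(-cost / eps)          # Gibbs kernel of the pairwise costs
    u = np.ones_like(w_src)
    for _ in range(iters):           # alternating marginal projections
        v = w_dst / (K.T @ u)
        u = w_src / (K @ v)
    return u[:, None] * K * v[None, :]

# Two source components (weights 0.6/0.4) coupled to two destination
# components (weights 0.5/0.5); moving across components costs 1.
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
w_src = np.array([0.6, 0.4])
w_dst = np.array([0.5, 0.5])
pi = sinkhorn(cost, w_src, w_dst)
```

The resulting matrix keeps most mass on the cheap diagonal pairings and moves only the excess 0.1 of weight across components.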
3) Optimal Microscopic Control for Each Mobile Target of the LSTS: With the knowledge of the optimal evolution of the macroscopic state of the LSTS (note that at time step k, we can compute the LSTS PDF p*_{k+1} corresponding to time step k + 1), each target must utilize a local microscopic control law that will induce the transition of the LSTS's macroscopic state to p*_{k+1} while ensuring safety at the microscopic level at all times. To compute the microscopic control actions for all the N_2 targets, we utilize the APF path planning (centralized) approach proposed in [6] and [23]. Although the focus of this work is to solve the MA control problem in a distributed way, we briefly describe how one can solve the microscopic control problem for the targets of the LSTS, both for completeness of our approach and in order to establish a framework for a fair comparison of our approach with those proposed in [6] and [23] when it comes to the MA motion coordination problem.
Let us consider the concatenation of all the microscopic inputs that will induce a safe transition of the macroscopic state of the LSTS from its original density to the desired PDF p*_{k+1}. It should be noted that the realization of the (centralized) microscopic control laws proposed in [6] and [23] requires knowledge of the LSTS's joint microscopic state X_k := [x_1(t_k); ⋯ ; x_{N_2}(t_k)] observed at the kth time step. Then, given the optimal density p*_{k+1} at time step k and the targets' joint microscopic state X_k, the optimal microscopic control inputs for the targets can be obtained in a centralized way by the APF algorithm. In particular, the total LSTS potential field is characterized as the weighted sum of the attractive and repulsive potentials defined in [6], [23]

U_tot = w_attr U_attr + w_rep U_rep (25)

where w_attr ≥ 0 and w_rep ≥ 0 are (user-defined) tuning weights indicating the desired tradeoff between the repulsive field U_rep and the attractive field U_attr. The optimal microscopic control action of the nth target at the kth time step is then computed from U_tot as in (26). The reader can refer to [6] and [23] for more details.
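An APF control action of the kind described above can be sketched as the negative gradient of a weighted potential as in (25). The quadratic attractive field and the standard inverse-distance repulsive field below are illustrative choices; the exact potentials are defined in [6] and [23].

```python
import numpy as np

def apf_control(x, goal, obstacles, w_attr=1.0, w_rep=0.5, d0=1.0):
    """Negative gradient of U_tot = w_attr*U_attr + w_rep*U_rep (cf. (25)).
    Quadratic attractive and inverse-distance repulsive fields are
    illustrative stand-ins for the potentials of [6], [23]."""
    u = -w_attr * (x - goal)              # -grad of 0.5 * ||x - goal||^2
    for c, r in obstacles:                # obstacles as (center, radius)
        d = np.linalg.norm(x - c) - r     # clearance to obstacle boundary
        if 0 < d < d0:                    # repulsion only within range d0
            u += w_rep * (1.0 / d - 1.0 / d0) / d**2 \
                 * (x - c) / np.linalg.norm(x - c)
    return u

# Obstacle out of range: pure attraction toward the goal.
u = apf_control(np.array([0.0, 0.0]), np.array([2.0, 0.0]),
                obstacles=[(np.array([1.0, 1.5]), 0.5)])
# Close to the obstacle: repulsion dominates and pushes the target away.
u2 = apf_control(np.array([0.3, 1.5]), np.array([2.0, 0.0]),
                 obstacles=[(np.array([1.0, 1.5]), 0.5)])
```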

B. Solution of the Multiagent Motion Coordination Control Problem for Tracking the LSTS in the Presence of Obstacles
In this section, a distributed motion coordination control law that will enable the agents of the MA system to track the LSTS is proposed. In particular, the network's agents will be forced to track and asymptotically converge to the centroids of their (generalized) Voronoi cells, which evolve with time in accordance with the GM density path p*(q, t) of the LSTS (solution of Problem 1), under the safety requirement that every agent must avoid collisions with all the dynamic obstacles it may encounter in the domain as well as with the other agents of the MA network.
The backbone of our approach is a modification of Lloyd's algorithm for optimal deployment of MA systems, whose main steps can be summarized as follows: 1) compute a generalized Voronoi partition of the spatial domain comprised of the OAVCs [15] generated by the current locations of the agents while accounting for nearby obstacles, and 2) compute the agents' individual control actions based on a move-to-centroid logic (each agent tries to reach the centroid of its own OAVC without leaving the latter set). The execution of the algorithm will iteratively alter the position of each agent, which will move toward the centroid of its own OAVC. In this approach, collision avoidance is guaranteed because each agent is constrained to move inside its own (generalized) Voronoi cell, which no other agent can enter.
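The two steps above can be sketched as one Lloyd iteration on a discretized domain. Plain (unweighted) Voronoi cells are used here for simplicity; the paper's OAVCs additionally buffer the cells against obstacles. All names and the gain value are illustrative.

```python
import numpy as np

def lloyd_step(agents, grid, density, k0=0.5):
    """One move-to-centroid step: each grid point joins the cell of the
    nearest agent, and each agent moves a fraction k0 toward the
    density-weighted centroid of its cell."""
    d2 = ((grid[:, None, :] - agents[None, :, :]) ** 2).sum(-1)
    owner = d2.argmin(axis=1)                 # Voronoi assignment
    new_agents = agents.copy()
    for i in range(len(agents)):
        cell = owner == i
        mass = density[cell].sum()
        if mass > 0:
            centroid = (density[cell, None] * grid[cell]).sum(0) / mass
            new_agents[i] += k0 * (centroid - agents[i])
    return new_agents

# Uniform density on the unit square; two agents settle near the
# centroids of the two half-cells split by the perpendicular bisector.
xs = np.linspace(0, 1, 21)
grid = np.array([[x, y] for x in xs for y in xs])
density = np.ones(len(grid))
agents = np.array([[0.1, 0.1], [0.9, 0.9]])
for _ in range(50):
    agents = lloyd_step(agents, grid, density)
```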
1) Obstacle-Aware Voronoi Cells: One method to ensure that the agents of the MA network will avoid collisions with each other and with the dynamic obstacles in the domain is to create a safe area around each agent by generating a buffered Voronoi cell (BVC) [25]. The latter is an adjusted Voronoi cell whose edges are computed within a safety radius (weight) around each agent such that its body is completely within its cell and its centroid is in the BVC. To generate buffered cells, we leverage the approach proposed in [15] for the computation of generalized Voronoi diagrams that differentiate between other agents and the dynamic (moving) obstacles within the uncertain environment (recall that each agent must avoid collisions with the other agents from the same network and with the obstacles). The utilized approach combines standard Voronoi diagrams [26] with the modified Voronoi tessellations proposed in [15]. Given the fact that some obstacles in the domain can be dynamic, the agents will compute modified Voronoi cells by dynamically assigning weights to each obstacle such that the boundaries of the Voronoi cells (which determine the safe area for each agent) are tangent to the obstacle boundaries and never intersect them. The latter reshaped Voronoi cells are referred to as "OAVCs" following [15]. Given an approximate obstacle map function m(q, t), which describes an approximate dynamic layout of the moving obstacles Ô, we define the ith OAVC A_i (corresponding to the ith agent) as in (27), where w_{i,j} denotes the dynamic weight whose value is assigned such that the boundary lines of the modified Voronoi cell are tangent to the dynamic obstacles. The proposed definition of the OAVC encompasses two techniques, namely, the standard Voronoi decomposition that produces the cells assigned to the agents, and the modified partitioning which adds dynamic weights determined by the moving obstacles encountered by the agents in the cluttered domain in order to ensure safety.
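A weighted (buffered) Voronoi cell of the kind in (27) can be expressed by half-plane inequalities, which a membership test makes concrete. The weight rule that makes the boundary tangent to the obstacles is specified in [15]; below the weights are simply given constants, so this is a sketch of the cell's form rather than of the weight computation.

```python
import numpy as np

def in_oavc(q, i, agents, weights):
    """Membership test for a weighted (buffered) Voronoi cell:
    q belongs to A_i iff ||q - p_i||^2 <= ||q - p_j||^2 - w_ij
    for every other generator j (cf. the form of (27))."""
    di = np.sum((q - agents[i]) ** 2)
    for j in range(len(agents)):
        if j != i and di > np.sum((q - agents[j]) ** 2) - weights[i][j]:
            return False
    return True

agents = np.array([[0.0, 0.0], [2.0, 0.0]])
w = {0: {1: 0.5}, 1: {0: 0.5}}          # illustrative buffer weights
inside = in_oavc(np.array([0.8, 0.0]), 0, agents, w)
```

A positive weight w_ij shrinks cell A_i away from the standard Voronoi bisector, creating the safety buffer; the unweighted case w_ij = 0 recovers the ordinary Voronoi cell.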
It is worth mentioning that in order for the agents to compute their OAVCs, they are only required to know the radii and the positions (centers) of the dynamic obstacles that are close by at each time t (this information is encoded in the obstacle map function m(q, t) and the obstacle layout Ô). Knowledge of the locations of more distant obstacles or other agents is not required. Moreover, by utilizing the dynamic weights in computing the OAVCs, the largest possible convex cells around the agents that are near the dynamic obstacles are generated. Therefore, by utilizing the proposed move-to-centroid distributed controller, the agents are guaranteed to stay inside their own cells while, at the same time, the MA network's density tracks the LSTS density path [the estimated GM PDF p(q, t)] characterized in (4). More precisely, the agents will asymptotically converge to the centroids of their OAVCs while avoiding collisions with the moving obstacles and the other agents.
Next, we will introduce the quantities that will be utilized in our proposed distributed control algorithm [17], [18], [19], [20], which will be presented later on. Given the density path of the LSTS p*(q, t) (solution to Problem 1), the mass M_i(A_i, t) and the centroid C_i(A_i, t) of the ith OAVC associated with the ith mobile agent are given by (29a) and (29b), respectively, and their time derivatives are given by (30a) and (30b). The distributed feedback control law must ensure that the configuration cost H(p, t; A) defined in (14) is a nonincreasing function of time along the agents' trajectories. Using the definitions of the mass of the OAVC in (29a) and its centroid in (29b), it follows that ∂H(p, t; A(t))/∂p_i = 0 when p_i = C_i, where the mass M_i > 0. The time derivative of the configuration cost H(p, t; A(t)) along the agents' trajectories is then given by (32). Therefore, to ensure that Ḣ(p, t; A) is negative semi-definite, our feedback control law (33) is designed with gains k_0 > 0 and k_1 ≥ 0. The correctness of the controller (33) will be proved in Section III-C. Because only local information is needed for each agent to determine its own OAVC (such as the total mass and the centroid of its OAVC), and each agent can compute its control action based on information related to its own cell, we say that the proposed controller is "Voronoi distributed" or decentralized [27].
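A move-to-centroid law in the spirit of (33) can be sketched as a proportional term toward the cell centroid plus a feedforward term tracking the centroid's motion. The exact structure and role of the gains k_0 and k_1 in (33) are given in the paper; the form below is an illustrative stand-in with names of our own choosing.

```python
import numpy as np

def move_to_centroid_control(p_i, C_i, dC_dt, k0=1.0, k1=0.2):
    """Hedged sketch of a move-to-centroid feedback law (cf. (33)):
    proportional attraction to the OAVC centroid C_i plus a feedforward
    term for the centroid's time variation dC_dt."""
    return k0 * (C_i - p_i) + k1 * dC_dt

# Agent at (1, 1), centroid at (2, 1) drifting upward at 0.5 units/s.
u = move_to_centroid_control(np.array([1.0, 1.0]),
                             np.array([2.0, 1.0]),
                             np.array([0.0, 0.5]))
```

The feedforward term is what lets the agent keep up with a centroid that moves because the reference density p*(q, t) is time varying.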

C. Analysis of Proposed Solution to Multiagent Motion Coordination Problem
We begin this section by proving that the proposed control law in (33) renders the time derivative of the configuration cost, Ḣ(p, t; A(t)), a negative semi-definite function. Then, following the arguments in [17], [18], [19], [20], and [28], we will show that when |∂p*(q, t)/∂t| is bounded from above, a distributed feedback controller that solves Problem 2 can be designed.
Proposition 1: Assume that there exists A_1 ≥ 0 such that |∂p*(q, t)/∂t| ≤ A_1 for all (q, t). Furthermore, assume that the proportional gain k_1 of the proposed controller in (33) satisfies the inequality in (35). Then, the controller proposed in (33) renders the time derivative of the configuration cost H(p, t; A(t)) nonpositive along the agents' trajectories, namely Ḣ(p, t; A(t)) ≤ 0.
Proof: The proof is omitted since it is similar to the proof of Proposition 2 in [17].
Following [17], [18], [19], and [20], we will next show that the agents of the MA network will asymptotically converge to the (moving) centroids of their OAVCs (critical points of the locational/configuration cost function H).The proof of convergence is based on Barbalat's lemma for stability analysis [29].
Lemma 1 (Barbalat's Lemma for Stability Analysis [29]): Consider a dynamical system described by ẋ = f(t, x) and a function V(p, t) which satisfies the following properties.
1) V(p, t) is bounded from below.
2) V̇(p, t) is negative semi-definite (along the trajectories of the dynamical system ẋ = f(t, x)).
3) V̇(p, t) is uniformly continuous in time.
Then, lim_{t→∞} V̇(p, t) = 0.
Proposition 2: The ith agent steered by the feedback controller (33) will converge asymptotically to the Voronoi centroid C_i of its corresponding OAVC.
Proof: Take V(p, t) = H(p, t; A(t)), where H(p, t; A(t)) is defined in (14). The proof of convergence relies on showing that the three conditions described in Lemma 1 are satisfied.
1) The lower boundedness of V is verified first. Because p(q, t) > 0 for all (q, t) ∈ R² × [t_0, ∞), it follows readily that the locational cost function H(p, t; A(t)) defined in (14) is non-negative. Hence, V(p, t) ≥ 0, and thus V is bounded from below.
2) The negative semi-definiteness of V̇(p, t) along the agents' trajectories follows directly from Proposition 1.
3) The fact that V̇(p, t) is uniformly continuous in time is verified last. The latter is proved by showing that V̈(p, t) = Ḧ(p, t; A) is bounded and finite. Differentiating (32) with respect to time yields (38). As shown in [17], [18], [19], and [20], the terms that appear on the right-hand side of (38), which are comprised of the quantities M_i, C_i, ∂M_i/∂t, ∂C_i/∂t, and ṗ_i [presented in (29a)-(30b) and (33), respectively], are finite and bounded functions of time. Thus, it is inferred that V̈(p, t) is also bounded. We have shown that V(p, t) satisfies all three conditions of Barbalat's lemma and, therefore, lim_{t→∞} V̇(p(t), t) = 0. Thus, it can be concluded that the agents will converge to the time-varying centroids of their respective OAVCs.
The next proposition states that the proposed MA control algorithm ensures safety at all times.
Proposition 3: Consider the MA network whose ith agent is initially located at p_i(t_0) = p_0^i ∉ Ô(t_0) and whose associated modified OAVC A_i is characterized as in (27). The distributed feedback controller proposed in (33) guarantees that p_i(t) ∈ A_i(t) for all t ≥ t_0. In addition, A_i(t) ∩ Ô(t) = ∅ for all t ≥ t_0.
Proof: It follows from the definition of the OAVC A_i given in (27) that the latter set is convex and nonempty. Therefore, the centroid of A_i lies in the interior of the convex hull generated by the vertices of A_i. Hence, by driving the agents toward the centroids of their associated OAVCs, the distributed control law given in (33) will not allow them to leave their own OAVCs (safe sets). Moreover, since the intersection of A_i(t_0) with the set occupied by any obstacle for the given obstacle layout Ô(t_0) [corresponding to the given obstacle map function m(q, t_0)] is empty, all successive computations of A_i will not intersect the approximated obstacle layout Ô(t) for any t ≥ t_0.

D. Combined Solution Approach for Tracking LSTS by Multiagent Network
Next, we will describe in detail the proposed solution approaches for Problems 1-3 and how one should combine them to obtain a holistic solution to the MA control problem for tracking a large-scale system of targets in the presence of dynamic obstacles. In the first step of the method, the dynamic obstacles in the uncertain domain are estimated by acquiring an approximation of the obstacle layout Ô(t) and the obstacle map function m(q, t) using the Hilbert occupancy mapping technique discussed in Section III-A1. Next, we utilize the latter map function to estimate the optimal density path of the LSTS, which appears in the formulation of the locational optimization problem (Problem 3).
By using the discretized form of the macroscopic cost function in (5), we compute the optimal control functional (macroscopic controller) that updates the targets' density p(q, t_k) to p*(q, t_{k+1}) := p*_{k+1} using the ADOC approach at each time t_k, where t_k := t_0 + kΔt with k ∈ {1, . . ., T_f} and Δt > 0 (fixed time step). After obtaining the optimal joint distribution π*_k, which is defined in (23), the GM optimal density path p*(q, t_{k+1}) := p*_{k+1} in (24) is computed at t_k by utilizing the solution to Problem 3, which corresponds to the optimal microscopic control laws u_{k,n} for the targets of the LSTS defined in (26). Subsequently, we solve Problem 2 by taking p*(q, t) (solution to Problem 1) to be a reference density path which will be used for tracking purposes by the MA network. By applying the microscopic inputs induced by the proposed MA motion coordination algorithm for target tracking, the MA system's density p_A(q, t) defined in (13) will approach (and ideally converge to) the reference density p*(q, t) (see Remark 1), and its agents will avoid collisions with the dynamic obstacles at all times.
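The combined per-step logic described above can be summarized as the following pseudocode sketch (a readable condensation of the steps, with equation references back to the text; the helper names are ours):

```
while k ≤ T_f and p_k ≠ p_T_d:
    t_k ← t_0 + kΔt
    m_k ← HilbertOccupancyMap(observations at t_k)      # Sec. III-A1
    p*_{k+1} ← ADOC(p_k, p_T_d, m_k)                    # Problem 1, (23)-(24)
    for each target n: apply APF input u_{k,n}          # Problem 3, (26)
    for each agent i:
        A_i ← OAVC(agent positions, m_k)                # (27)
        compute M_i, C_i over A_i using p*_{k+1}        # (29a)-(29b)
        apply move-to-centroid input u_i                # (33)
    k ← k + 1
```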
The exact steps of the proposed solution approach (in the form of a pseudocode) are given in Algorithm 1.

IV. NUMERICAL SIMULATIONS
Next, we present numerical simulation results that showcase the effectiveness of our proposed method in solving Problems 1-3. We conducted our simulations using N_1 = 100 agents and N_2 = 500 targets with single-integrator dynamics, that is, ẋ_n(t) = u_n(t), where x_n is the position vector of the nth target in the interior of the domain [0, 20] × [0, 20] and u_n is its linear velocity vector. The targets are required to move from their a priori known initial density p_0 = Σ_{i=1}^{N_0} w_0^i g_0^i, which corresponds to a GM with N_0 = 4 components, mean vectors equal to [1.5; 7], [4; 1.5], [1.5; 4], and [7; 1.5], covariance matrices equal to 0.5 I_2 for all the GM components, and weight vector w_0 = [0.3; 0.2; 0.2; 0.3], to the desired final density p_{T_d}, which corresponds to a GM that has N_{T_d} = 4 components with mean vectors equal to [14; 12], [2; 17], [7; 18], and [12; 16], covariance matrices equal to 0.5 I_2, and weight vector w_{T_d} = [0.2; 0.3; 0.3; 0.2]. The initial density of the MA network p_A(q, t_0), characterized in (13), has N_A = 3 components with mean vectors equal to [8.5; 1.5], [1.5; 1.5], and [1; 5]. In our simulations, the time step we use for integrating the equations of motion is dt = 0.9, whereas t_k = t_0 + kΔt for k ∈ {1, . . ., T_f} and Δt = 0.002. Fig. 2 demonstrates the movement of the targets driven by the optimal microscopic controller in (26) to avoid all dynamic obstacles (indicated by the gray areas) as they move from an initial GM distribution to a final GM goal distribution following the optimal path and estimated density in (24) generated by the ADOC approach, which continuously updates the optimal path online based on the approximated layout of the obstacles Ô(t). Fig. 3 shows the evolution of the Gaussian PDF of each group of targets of the LSTS, which together form the GM density of the latter system. Each target is steered by the optimal microscopic control law given in (26), whereas the optimal density path of the LSTS is generated by the optimal macroscopic control law computed by the ADOC approach. Furthermore, Fig. 3 shows the 2-D distribution of the targets and the density evolution of the LSTS as the targets maneuver around the dynamic gray obstacle areas. Fig. 4 shows 3-D graphs of the evolution of the weighted Gaussian components of the estimated optimal GM distribution of the large-scale system of targets over the uncertain domain at distinct time steps.
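The initial and desired GM densities used in the simulation above can be assembled directly from the listed parameters. The helper below is an illustrative sketch (our names), restricted to the isotropic covariances 0.5 I_2 stated in the text.

```python
import numpy as np

# GM parameters as listed in the text; every component has covariance 0.5*I_2.
w0 = np.array([0.3, 0.2, 0.2, 0.3])
mu0 = np.array([[1.5, 7], [4, 1.5], [1.5, 4], [7, 1.5]], dtype=float)
wTd = np.array([0.2, 0.3, 0.3, 0.2])
muTd = np.array([[14, 12], [2, 17], [7, 18], [12, 16]], dtype=float)

def gm_eval(q, w, mu, sigma2=0.5):
    """Evaluate an isotropic 2-D GM (common variance sigma2) at points q (N x 2)."""
    d2 = ((q[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
    g = np.exp(-d2 / (2 * sigma2)) / (2 * np.pi * sigma2)
    return g @ w

# Value of p_0 at the mean of its first (weight-0.3) component.
p0_at_mode = gm_eval(np.array([[1.5, 7.0]]), w0, mu0)
```

At a component mean the other components are several standard deviations away, so the density there is close to w_i / (2π σ²) = 0.3/π.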
The trajectories of the targets generated by the ADOC approach (driven by the optimal microscopic inputs) are plotted in Fig. 5. It is observed that the MT system's density has been successfully steered from the initial GM density to the goal GM density. In addition, Fig. 6 demonstrates the efficiency of the ADOC methodology measured in terms of the energy cost per kg, Ē(k), and the distance-to-go, D(k), which are defined in [6], where η = (1000/3600)² is utilized for unit conversion.

Fig. 7. Generalized Voronoi diagrams (composed of OAVCs) generated by the agents' locations at (a) t = 0, (b) t = 300Δt, (c) t = 400Δt, and (d) t = 720Δt. The agents are denoted as small triangles, the targets as small blue circles, the OAVCs as blue polygons, and the (moving) obstacles as gray areas.

Fig. 7 illustrates the modified Voronoi cells (OAVCs), displayed as filled blue polygons (the small triangles indicate the locations of the agents associated with the OAVCs). The agents in this figure track the centroids of their OAVCs, which evolve with time t in accordance with the evolution of the approximated optimal density p*(q, t) of the LSTS. Moreover, Fig. 7 shows how the agents follow the targets (presented as blue circles) by tracking their optimal estimated density p*(q, t) while avoiding collisions with the dynamic obstacles (presented by the gray areas), which change configuration with time and are continuously predicted by the Hilbert occupancy mapping approach.
Fig. 8 displays the paths of the agents driven by the controller given in (33), which converge to the critical configurations of the locational optimization problem (Problem 2). Furthermore, in Figs. 7(d) and 8, it is observed that the agents of the MA network follow the optimal density path of the LSTS. Hence, as the agents of the network approach their limit locations, the density of the MA network p_A(q, t), which is characterized in (13), approaches the optimal density path of the LSTS p*(q, t), which is defined in (4) (the latter density path was used as a reference for tracking purposes by the MA network).

Fig. 8. Agents' collision-free trajectories while tracking the mobile targets, in which the initial and terminal locations of the agents are denoted by small circles and small diamonds, respectively, and the gray regions depict the obstacles at a single time step.

The proposed controller's (33) performance with regard to minimizing the locational cost H(p, t; A(t)) is shown in Fig. 9. Fig. 9(a) illustrates how the locational cost H(p, t; A) decreases with time. Furthermore, it can be observed in Fig. 9(b) that Ḣ(p, t; A(t)) remains nonpositive throughout the entire period of time, which indicates that the cost is a nonincreasing function of time, in agreement with the conclusions presented previously (see Proposition 1 and its proof). Fig. 10 shows the difference between the weights (ŵ_k^i)* for all i ∈ {1, . . ., N_{T_d}}, defined in (21) and approximated by the ADOC approach, and the mixing proportions of the GM density of the agents p_A(q, t), which divided themselves into four teams over the uncertain domain as they track the GM density path of the LSTS p*(q, t). The weights (mixing proportions) of the GM density of the MA network p_A(q, t) (representing the mixing proportions of the four groups of agents) were acquired by utilizing the GMMLO algorithm, which was executed in MATLAB 2019 by applying the Fit GMM to Data function. We observe that the differences in the mixing proportions decrease with time and eventually become negligible, which implies that the agents have managed to successfully disperse themselves into four groups in accordance with the estimated weights of the Gaussian components of the density path of the LSTS. Each team, which comprises a different number of agents, has an associated Gaussian PDF and a corresponding weight. Furthermore, the decrease in the difference between the mixing proportions confirms that the agents have successfully tracked the GM density path of the LSTS.
The rate of convergence of our proposed approach based on the control law given in (33) is illustrated in Fig. 11, while Fig. 12 presents the rate of convergence of our proposed method evaluated with respect to the controller given in (33) for different values of the time step dt used for the discretization of the agents' dynamics. Fig. 12 shows that the use of a bigger time step dt results in an increase in the convergence rate, which means that the agents move faster toward the desired locations while incurring less cost. However, with the latter choice, more fluctuations occur before the system settles, as the convergence rate gradually approaches a constant value (which is close to zero). Alternatively, lower values of dt result in a low convergence rate, meaning that the system dynamics are slow and the incurred cost is high. In addition, we observe that after settling at a constant value, the convergence rate increases again at the final time steps. Therefore, our chosen value dt = 0.9 meets all the criteria, as it results in a convergence rate that is not low (thus, the system evolves fast with an acceptable cost), does not result in high fluctuations, and eventually settles at zero. Similar considerations apply to the choice of the time step δt used in the discretization of the task time interval of the ADOC approach and the targets' dynamics.
Thus, the time step was chosen to yield the best possible performance (evaluated empirically) of the LSTS generated by the ADOC approach in terms of the average energy cost per kg and the distance-to-go. Finally, it should be mentioned that the chosen time steps should be compatible with the speed of the moving obstacles so that collisions of agents and targets with those dynamic obstacles are avoided at all times.
Fig. 13(a) illustrates the evolution with time of the distance between the MA system's density p_A(q, t) in (13) and the density of the LSTS p*(q, t), measured in terms of the spatial L_2 norm (density tracking percentage error). It can be concluded that the density tracking percentage error remains small, indicating that the distribution of the MA network closely follows the estimated GM density path of the LSTS, which is in agreement with the key points of the discussion given in Remark 3 [refer to (16)]. Moreover, Fig. 13(b) shows the time evolution of the distance between the targets' distribution p_T(q, t), which is defined in (9), and the estimated density of the LSTS (reference GM density path) p*(q, t), measured again in terms of the spatial L_2 norm. It is clearly shown in Fig. 13 that the proposed move-to-centroid Voronoi-distributed controller u_i in (33) performs better, in terms of density tracking percentage error, than the centralized control law (26) proposed in [6], whose performance is illustrated in Fig. 13(b). Hence, utilizing the ADOC approach together with our designed control law results in better tracking performance in comparison with the APF approach used in [6].
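A discretized version of the spatial L_2 tracking error used above can be sketched as follows. The normalization by the L_2 norm of p* (to obtain a percentage) is our convention for illustration; the paper's exact error definition may differ.

```python
import numpy as np

def l2_tracking_error_pct(pA, pStar, dA):
    """Percentage error between two densities sampled on a grid, using the
    discretized spatial L2 norm; dA is the area of one grid cell.
    The normalization by ||p*||_2 is an illustrative convention."""
    num = np.sqrt(np.sum((pA - pStar) ** 2) * dA)
    den = np.sqrt(np.sum(pStar ** 2) * dA)
    return 100.0 * num / den

# Toy example on three grid cells of unit area.
pStar = np.array([0.2, 0.6, 0.2])
pA = np.array([0.25, 0.5, 0.25])
err = l2_tracking_error_pct(pA, pStar, dA=1.0)
```

Note that with a uniform grid the cell area dA cancels in the ratio, so the percentage error is insensitive to the grid resolution scaling.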

V. CONCLUSION
In this work, we presented an MA motion coordination approach for tracking large-scale systems of moving targets in an uncertain environment populated by static and dynamic obstacles. We first computed a density path which corresponds to a reference time evolution of the macroscopic state of an LSTS, which was later utilized for tracking purposes. The latter density path is represented by a (time-varying) GM PDF whose Gaussian components and associated weights are generated by utilizing an ADOC approach. Subsequently, we developed an MA motion coordination algorithm which utilizes the reference GM density path together with Lloyd's coverage algorithm and a decomposition of the domain into a collection of OAVCs. The latter algorithm allowed us to compute the individual (microscopic) inputs for each agent of the MA network based on local information while guaranteeing collision avoidance at all times. In our future work, we will try to develop more computationally tractable approaches for the solution of the density path planning problem by utilizing covariance steering methods based on the Wasserstein distance [30] and extending their applicability to problems over the space of GMs (a key feature of the latter methods is that they do not require spatial discretization and thus can be significantly more computationally tractable than the methods considered herein).
Alaa Z. Abdulghafoor received the B.Sc. degree (with Highest Hons.) in electrical engineering from The Petroleum Institute (currently known as Khalifa University), Abu Dhabi, UAE, in 2015, and the M.Sc. and Ph.D. degrees in aerospace engineering in the area of controls, autonomy, and robotics from The University of Texas at Austin, Austin, TX, USA, in 2020 and 2022, respectively.
Her research interests include control of autonomous agents and multiagent networks, optimization and optimal control, estimation, and path planning.
Efstathios Bakolas (Senior Member, IEEE) received the Diploma degree (with Highest Hons.) in mechanical engineering from the National Technical University of Athens, Athens, Greece, in 2004, and the M.S. and Ph.D. degrees in aerospace engineering from the Georgia Institute of Technology, Atlanta, GA, USA, in 2007 and 2011, respectively.
He is currently an Associate Professor with the Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, Austin, TX, USA. His research is mainly focused on control of uncertain and stochastic systems, data-driven control of complex systems, optimal control theory, decision making and control of autonomous agents, multiagent networks, and differential games.

Fig. 1 .
Fig. 1. High-level descriptions of Problems 1 and 2 [Fig. 1(a)] and Problem 3 [Fig. 1(b)]. The LSTS is represented by three red dashed ellipses, each of which is associated with a Gaussian density (a component of the GM density of the LSTS). An MA network (denoted by the black triangles) tries to track macroscopically the LSTS (red stars). It can be observed that as the three groups of targets move away from each other, so do their associated Gaussian components. Accordingly, the MA network splits into three subgroups, each of which follows a different team of targets.
Here, g_k and g_{T_d} are the collections of all the Gaussian components (PDFs) specified by the means μ_k^i and μ_{T_d}^j and the covariance matrices Σ_k^i and Σ_{T_d}^j, respectively. Therefore, the tuples of parameters p_k = (N_k, g_k, w_k) and p_{T_d} = (N_{T_d}, g_{T_d}, w_{T_d}) characterize the PDFs p_k and p_{T_d}, respectively.

Algorithm 1 Motion Coordination Algorithm for Tracking an LSTS
1: Inputs: number of agents N_1, number of targets N_2, domain, time step dt for discretization of agents' dynamics, time step Δt, m_0(q, t_0), Ô_0, p_0, p_{T_d}, controller gain k_0, initial time t_0
2: Let k = 0
3: while (k ≤ T_f) and (p_k ≠ p_{T_d}) do
4:   Let t_k = t_0 + kΔt
5:   Approximate m(q, t_k)
...
12:  Update time step k (k = k + 1)
13: ...

Fig. 2 .
Fig. 2. Positions of the targets (indicated by the blue circles) as they move from an initial GM density, follow the estimated optimal density path, and eventually reach a final goal GM density while avoiding the (moving) obstacles (indicated by the gray areas), at (a) t = 0, (b) t = 300Δt, (c) t = 400Δt, and (d) t = 720Δt.

Fig. 3 .
Fig. 3. Evolution of the targets' PDFs which form the GMM obtained from the ADOC approach. The gray areas indicate the sets occupied by the dynamic obstacles. (a) t = 0. (b) t = 300Δt. (c) t = 400Δt. (d) t = 720Δt.

Fig. 4 .
Fig. 4. 3-D graphs of the weighted Gaussian components of the estimated optimal GM distribution of the LSTS over the spatial domain at different time steps. (a) t = 0. (b) t = 300Δt. (c) t = 400Δt. (d) t = 720Δt.

Fig. 5 .
Fig. 5. Optimal trajectories of the targets generated using the ADOC method, in which the initial locations of the targets are denoted by small circles and the final positions by small diamonds, and the gray regions depict the obstacles at a single time step.

Fig. 6 .
Fig. 6. Performance of the LSTS generated by the ADOC approach in terms of (a) average energy cost per kg and (b) average distance-to-go.

Fig. 10 .
Fig. 10. Difference between the estimated mixing proportions (ŵ_k^i)* (obtained by applying the ADOC approach) of the four GM components of the targets' estimated GM density and the mixing proportions of the GM density p_A of the agents.

Fig. 13 .
Fig. 13. Percentage error in the distance (calculated by the spatial L_2 norm) between the agents' PDF p_A(q, t) and the approximated density of the LSTS p*(q, t) [Fig. 13(a)], and between the targets' density p_T(q, t) and the approximated PDF of the LSTS p*(q, t) [Fig. 13(b)].

14: ...
15: Generate A_i(t_k)
16: for t_2 = t_k : dt : t_{k+1} do
17:   Compute M_i(A_i(t_k)) and C_i(A_i(t_k))