Best Fit of Mixture for Distributed Poisson Multi-Bernoulli Mixture Filtering

Abstract—The Poisson multi-Bernoulli mixture (PMBM) filter is extended for distributed implementation over a wireless sensor network. At the core of the proposed networking approach, the PMBM posterior is decomposed into two parts corresponding to the undetected and the detected targets, respectively. Fusion is performed with regard to the latter only, which is represented by a MBM, based on a distributed flooding algorithm for internode communication that iteratively shares the MBMs between neighbor sensors. Then, a suboptimal "best-fit-of-mixture" principle is followed at each local sensor to find a MBM that best fits the mixture of MBMs aggregated from distinct sensors, leading to an arithmetic average (AA) of these MBMs. We prove the exact closure of the MBM-AA fusion and discuss its sub-optimality and limitations. Simulations demonstrate the effectiveness and limitations of our approach.

Different from the (C)PHD and MB filters, the Poisson multi-Bernoulli mixture (PMBM) filter [29]-[31] has a closed-form filtering recursion (namely, closure) based on standard state space models with Poisson birth, and has demonstrated better performance [32], [33] when the target detection probability is low. That is, the PMBM is a multi-target conjugate prior. Another relevant conjugate prior in the RFS family is the generalized labelled MB [6], [34], [35]. The PMBM consists of the combination of a Poisson point process (PPP) and a multi-Bernoulli mixture (MBM), where the PPP represents all undetected targets, which makes the filter more sensitive to target birth, while the MBM considers different track-measurement-association hypotheses, gaining higher accuracy than a single MB. The success of the PMBM filter gives rise to three variants: 1) If the birth model is MB or MBM instead of PPP, the PMBM filter reduces to a MBM filter [31], [36]. 2) If only one global data association hypothesis is maintained in the MBM, the PMBM filter reduces to a PMB filter [32], [37]. 3) By extending to the continuous-time multitarget system, the PMBM filter is extended to a continuous-discrete PMBM filter [38].
Recently, the GA fusion has been exploited for PMB fusion in [39], [40], which fuses the PPP and the MB separately and approximately by assuming that all targets are well spaced. PMBM fusion can be addressed similarly with regard to the PPP and the MBM, respectively. However, the MBM-GA fusion does not admit exact closure, and so far it is even unclear how to approximate it.
In this work, we investigate the distributed flooding algorithm [41] for internode MBM communication which iteratively exchanges the MBMs between neighbor sensors, resulting in a mixture of MBMs at each sensor. Then, following the "best-fit-of-mixture" fusion principle, a MBM is found that fits the mixture of MBMs with minimum KLD and 2-norm distance, which is exactly given by the AA of these MBMs. We prove this exact closure for MBM-AA fusion. Moreover, for better communication and computation efficiency, only a single or a few MBs in the MBM of each individual sensor are disseminated to the other sensors, while the PPP that is typically of minor intensity and does not admit closure for averaging fusion remains unchanged at individual sensors. This paper is organized as follows. Preliminaries regarding the standard model we consider, MB, and MBM are given in Section II. Optimal multi-sensor fusion, sub-optimality of the AA fusion and the idea of best-fit-of-mixture are analyzed and illustrated in Section III. The exact closure of the MBM-AA fusion and its implementation via MBM-flooding are given in Section IV. Simulation results are given in Section V. We conclude in Section VI.

A. Standard Models
The realization of an RFS of the multitarget states is a set X = {x_1, x_2, . . . , x_n}, where n = |X| is the random number of targets and x_i ∈ R^d is the state vector of the i-th target. The random nature of the multitarget set X is captured by its probability density, denoted by f(X), and its cardinality distribution ρ(n) ≜ Pr{|X| = n}. Considering a sensor network composed of sensors s = 1, 2, . . . , S, we denote by S_s the set of neighbor sensors of sensor s. We assume that the fields of view of all sensors are identical and contain a random, time-varying number of targets. Each sensor operates a PMBM filter [29]-[31] for detecting and tracking these targets. We note that these sensors are typically correlated, in an unknown manner, in the information they have about the targets and in their prior knowledge.
Targets arrive at each time according to a non-homogeneous PPP, independent of target survivals. Setting the cardinality distribution of the new-born target RFS to Poisson with rate λ (namely, ρ(n) = e^{−λ} λ^n / n!) yields the multidimensional Poisson distribution [4, p. 366]

f^ppp(X) = e^{−λ} Π_{x∈X} λ p(x),

where, hereafter, p(x) is the probability density function (PDF) of a single target. Each target evolves and is measured by each sensor independently of the other targets. The survival process of each target is Bernoulli. That is, at time k − 1, the target with state x_{k−1} will either die with probability 1 − p^s_k or persist at time k with survival probability p^s_k and attain a new state x_k according to a Markov transition PDF f_{k|k−1}(x_k|x_{k−1}).
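To make the birth model concrete, the following sketch draws one realization of a Poisson RFS: the cardinality is Poisson-distributed with rate λ, and the elements are i.i.d. samples from the spatial density p(x). The uniform spatial density over the surveillance region and the rate value are illustrative assumptions, not taken from this paper.

```python
import math
import random

def sample_poisson(lam, rng):
    # Knuth's method: count uniform draws until their product falls below e^{-lam}
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_poisson_rfs(lam, sample_spatial, rng):
    """One realization of a Poisson RFS: n ~ Poisson(lam), then n i.i.d.
    single-target states drawn from the spatial density p(x)."""
    n = sample_poisson(lam, rng)
    return [sample_spatial(rng) for _ in range(n)]

rng = random.Random(0)
# Hypothetical uniform spatial density over a [0, 400]^2 surveillance region.
births = sample_poisson_rfs(2.0, lambda r: (r.uniform(0, 400), r.uniform(0, 400)), rng)
```

Averaging the cardinality over many such draws recovers the Poisson rate λ, as the cardinality distribution requires.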
Given a target with state x_k, sensor s either detects it with probability p^d_{s,k}, generating a measurement z_{s,k} ∈ Z_{s,k} with likelihood g_{s,k}(z_{s,k}|x_k), or misses it with probability 1 − p^d_{s,k}, where Z_{s,k} denotes the set of measurements received at time k by sensor s. The clutter at sensor s follows a Poisson RFS with Poisson rate κ_s, independent of the target-originated measurements.

B. Bernoulli RFSs and Their Union: MB
A Bernoulli RFS X^b can either be empty (with probability 1 − r) or contain one element (with probability r) distributed over the state space according to a PDF p(x). That is, the probability distribution of the Bernoulli RFS X^b is

f^b(X) = 1 − r if X = ∅, f^b(X) = r p(x) if X = {x}, and f^b(X) = 0 otherwise.

To represent the posterior of a random number of (no more than M) targets, M Bernoulli RFSs X^b_i with respective target existence probabilities r^(i) and state PDFs p^(i)(·), i = 1, 2, . . . , M, can be used. Their union is a MB RFS, which is completely characterized by the M parameter pairs {(r^(i), p^(i))}^M_{i=1}. The corresponding MB distribution can be expressed, for any given cardinality |X^mb| = n ≤ M, as

f^mb(X) = Σ_{X_1 ⊔ ··· ⊔ X_M = X} Π^M_{i=1} f^b_i(X_i),

where ⊔ stands for disjoint union.
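Since a MB is a union of independent Bernoulli components, its cardinality distribution is the law of a sum of independent Bernoulli variables. A minimal sketch, using a list of existence probabilities as an illustrative representation of our own:

```python
def mb_cardinality(existence_probs):
    """Cardinality distribution of a multi-Bernoulli RFS: the law of a sum of
    independent Bernoulli variables, computed by iterated convolution."""
    rho = [1.0]  # start from the empty MB: P(|X| = 0) = 1
    for r in existence_probs:
        nxt = [0.0] * (len(rho) + 1)
        for n, p in enumerate(rho):
            nxt[n] += p * (1.0 - r)   # this component is absent
            nxt[n + 1] += p * r       # this component is present
        rho = nxt
    return rho

# Two Bernoulli components with r = 0.9 and r = 0.5:
rho = mb_cardinality([0.9, 0.5])
# rho = [0.05, 0.5, 0.45], i.e. P(n=0) = 0.1*0.5, P(n=1) = 0.9*0.5 + 0.1*0.5, P(n=2) = 0.9*0.5
```

The expected cardinality is then simply the sum of the existence probabilities r^(i).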

C. Weighted Union of MBs: MBM [37]
The MB can efficiently approximate the posterior multitarget density [42]. The approximation accuracy can be improved by using a linear combination of multiple MBs, namely a MBM, whose different MBs correspond to different global hypotheses of the measurement-to-target association history. That is, the MBM RFS is a normalized, weighted sum of the multitarget densities of MBs, which can be parameterized as the mixture of MB RFSs

f^mbm(X) = Σ_{j∈J} w_j Σ_{⊔_{i∈I_j} X_i = X} Π_{i∈I_j} f^b_{j,i}(X_i),

where I_j is the index set of the Bernoulli components (BCs) in MB j, J is the index set of the MBs in the MBM (each term corresponding to a global hypothesis), and w_j ≥ 0 is the coefficient/probability assigned to MB/hypothesis j ∈ J, subject to Σ_{j∈J} w_j = 1.
Note that global hypotheses are made up of single hypotheses/BCs, each of which corresponds to a potential target [29]. Instead of posing a weight for each global hypothesis, one may weight single hypothesis/BC i in global hypothesis j by w_{j,i}; see, e.g., [31], [36]. Then, the global hypothesis weight satisfies w_j ∝ Π_{i∈I_j} w_{j,i}, where ∝ stands for proportionality.

A. Optimal Fusion and Conservative Fusion
Consider two correlated posteriors f(X|Z_{1,1:k}) and f(X|Z_{2,1:k}) at time k. Their joint posterior is given conceptually as follows [8], [43]:

f(X|Z_{1,1:k} ∪ Z_{2,1:k}) ∝ f(X|Z_{1,1:k}) f(X|Z_{2,1:k}) / f(X|Z_{1,1:k} ∩ Z_{2,1:k}), (9)

where the denominator in (9) is used to divide out the common information between the two fusion sources. Netted sensors observing the same targets often use the same prior information and model assumptions. Unfortunately, apart from a few favorable, simple cases with a priori information [44]-[47], it is practically intractable to measure the common information. It then becomes important to seek conservative fusion that avoids underestimating the actual squared estimation errors [8], [48]-[50]. That is, for a posterior f_s(x) with estimate mean x̂_s ≜ ∫ x f_s(x) dx and error covariance matrix P_s ≜ ∫ (x − x̂_s)(x − x̂_s)^T f_s(x) dx regarding the real state x, the estimate is conservative if and only if P_s is no less than the actual mean square error of the estimate, i.e., P_s ⪰ E[(x − x̂_s)(x − x̂_s)^T].
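To see why conservativeness matters, consider a scalar toy example of our own (not from the paper): two sensors report identical Gaussian posteriors, so they share all of their information, yet the naive Bayes-product fusion, which treats them as independent, halves the reported variance:

```python
def gaussian_product_var(P1, P2):
    """Variance of the product of two scalar Gaussian densities, i.e. the
    naive Bayes combination that assumes the two sources are independent."""
    return 1.0 / (1.0 / P1 + 1.0 / P2)

# Two fully correlated sources reporting the same estimate with variance 4.
# The product rule returns variance 2 although no new information was added:
# it underestimates the actual mean square error, i.e., it is not conservative.
P_naive = gaussian_product_var(4.0, 4.0)
```

The AA of the two identical densities, by contrast, leaves the density (and hence the variance) unchanged at 4, satisfying the conservativeness condition above.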

B. Sub-optimality of AA Fusion
As a proven conservative fusion rule [8], the AA fusion of the probability distributions f_s(X), s = 1, 2, · · · , S, is given by f_AA(X) = (1/S) Σ^S_{s=1} f_s(X). Essentially, the AA is a Fréchet mean corresponding to the 2-norm-distance-based Fréchet function¹ [9],

f_AA(X) = arg min_g (1/S) Σ^S_{s=1} ∥g − f_s∥²₂. (11)

Relatedly, the AA fusion also minimizes the average of the KLDs of the fused result relative to the fusing probability distributions f_s(X), s = 1, 2, · · · , S [52],

f_AA(X) = arg min_g (1/S) Σ^S_{s=1} D_KL(f_s ∥ g), (12)

where the KLD of the probability distribution g(X) relative to f(X) is given as D_KL(f ∥ g) ≜ ∫ f(X) log (f(X)/g(X)) δX. Mathematically, the factor 1/S can be dropped from both (11) and (12) without affecting the minimizer. It then becomes more evident that the probability distribution that best fits the mixture of the probability distributions {f_s}^S_{s=1}, in the sense of minimizing both the 2-norm distance and the KLD, is the AA of all terms in the mixture. Considering that the mixture of these posteriors {f_s}^S_{s=1} contains the complete posterior information yielded by all sensors, it is a reasonable substitute for the multi-sensor true posterior. The mixture, however, contains the common information of the sensors (as assumed from the beginning) and is no more than an approximation of the true posterior unless the common information is divided out properly. This essentially differs from fitting the true posterior as in, e.g., [37], [53]. Therefore, the optimization in both (11) and (12) is suboptimal.

¹The Fréchet function corresponding to a Fréchet mean like the AA may not be unique. For example, the Cauchy-Schwarz divergence reduces to a 2-norm distance for PPPs [51], and therefore its sum serves as another Fréchet function for the AA in the case of a PPP RFS; see also [13].
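The KLD-minimization property of the AA can be checked numerically on discrete distributions. In the sketch below (our own illustration), the AA of two probability vectors attains a smaller summed divergence Σ_s D_KL(f_s ∥ g) than other candidate distributions:

```python
import math

def kld(f, g):
    """Kullback-Leibler divergence D_KL(f || g) for discrete distributions."""
    return sum(fi * math.log(fi / gi) for fi, gi in zip(f, g) if fi > 0)

def aa(dists):
    """Unweighted arithmetic average of probability vectors."""
    return [sum(d[i] for d in dists) / len(dists) for i in range(len(dists[0]))]

f1 = [0.7, 0.2, 0.1]
f2 = [0.1, 0.3, 0.6]
g_aa = aa([f1, f2])  # [0.4, 0.25, 0.35]

def objective(g):
    # the summed (equivalently, averaged) divergence that the AA minimizes
    return kld(f1, g) + kld(f2, g)
```

Evaluating `objective` at `g_aa` and at any other normalized vector, e.g. `[0.5, 0.25, 0.25]` or the uniform distribution, confirms that the AA attains the smaller value.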
The principle of best-fit-of-mixture can be illustrated as in Fig. 1. In fact, AA-PHD fusion [10]-[14], AA-CPHD fusion [15]-[17], BC-AA fusion [18], MB-AA fusion [9] and RFS-GA fusion [20]-[28] all essentially follow the best-fit-of-mixture principle, aiming to best fit the mixture of unknown-correlated PHDs, CPHDs, BCs and MBs from different sensors, respectively. The key challenge of the fit, however, stems from non-closure: for example, the AA of PPPs/MBs is no longer a PPP/MB.

A. PMBM Conjugate Prior
Based on the standard state space model assumptions with Poisson target birth, the PMBM conjugate prior at sensor s is given by [29], [37]

f_s(X) = Σ_{X^u ⊔ X^d = X} f^ppp_s(X^u) f^mbm_s(X^d).

As shown, the PMBM consists of a PPP component and a MBM component, which represent the undetected targets (which are hypothesised to exist but have never been detected) and the detected targets, respectively [29]. Details on predicting and updating both the PPP and the MBM are provided in [29], [31] and are omitted here.
Continuously-missed detections that contribute to the PPP vary across sensors due to the independent, random nature of the detection events. Furthermore, the PPP usually has a minor intensity and does not admit closure under averaging fusion. Therefore, there is little need, yet significant difficulty, in fusing the PPPs. Following this line of thinking, we do not perform fusion over the PPPs obtained at different sensors: only the MBMs are communicated and fused over the sensor network. This also saves communication and fusion computation.
The proposed multi-sensor PMBM filter is illustrated in Fig. 2. As addressed next, both the flooding and the AA fusion of the MBMs need no approximation, maintaining an exact MBM structure. Clearly, these approaches are straightforwardly applicable to the MBM filter [31], [36].

B. AA of MBMs
The following Lemma lies at the core of our proposed fusion approach.
Lemma 1. Given the MBM distributions

f^mbm_s(X) = Σ_{j∈J_s} w_{s,j} f^mb_{s,j}(X), s = 1, 2, · · · , S, (14)

and fusion weights w_s ≥ 0 with Σ^S_{s=1} w_s = 1, their AA, given by

f^mbm_AA(X) = Σ^S_{s=1} w_s f^mbm_s(X), (15)

remains a MBM.
Proof. The proof follows straightforwardly by substituting (14) into (15), which yields the MBM

f^mbm_AA(X) = Σ^S_{s=1} Σ_{j∈J_s} w_s w_{s,j} f^mb_{s,j}(X). (16)

Lemma 1 indicates that the MBM-AA fusion admits exact closure. The key operations of such a fusion are mixing and re-weighting the MBMs: the global hypotheses of all sensors are unionized, with their weights scaled by the corresponding fusion weights. Indeed, the (weighted) mixtures of multiple Bernoulli RFSs, MB RFSs and MBM RFSs are a MB RFS, a MBM RFS and a MBM RFS, respectively, as illustrated in Fig. 3. In addition to the MBM, which maintains closure under the mixture union operation, some other popular mixture distributions, such as the Dirac delta mixture (commonly known as the particle posterior) and the GM, also admit exact closure for AA fusion. We now address how the MBMs can be disseminated in a distributed manner that maintains exact closure. We consider the distributed flooding scheme [41], which is naturally consistent with the mixture union operation and guarantees exact and efficient convergence.
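A minimal sketch of the mixing-and-reweighting operation behind Lemma 1, using a hypothetical list representation of our own in which a MBM is a list of (hypothesis weight, MB) pairs and a MB is a list of Bernoulli parameters:

```python
def mbm_aa(mbms, fusion_weights):
    """AA fusion of MBMs: union all global hypotheses (MBs) and scale each
    hypothesis weight by the fusion weight of the contributing sensor.
    The result is again a MBM -- no approximation is involved."""
    fused = []
    for omega, mbm in zip(fusion_weights, mbms):
        for w_j, mb in mbm:          # mb: list of (r, p) Bernoulli parameters
            fused.append((omega * w_j, mb))
    return fused
```

Because the result is just a longer list of weighted MBs, with weights still summing to one, the fused object is again a MBM; this is exactly the closure that Lemma 1 states.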

C. MBM Mixing by Flooding
In distributed flooding [41], each local sensor serves equivalently as a fusion center which mixes the relevant information from the other sensors via iterative neighborhood communication. Re-weighting/scaling is carried out at the end of all communication iterations in each filtering step. For clarity, we explain the MBM-flooding algorithm here with respect to the global hypotheses corresponding to the MBM distributions. That is, the flooding algorithm updates the set of global hypotheses of sensor s at iteration t = 1, 2, · · · as

J_s(t) = J_s(t − 1) ∪ U^mbm_s(t − 1), (20)

where J_s(t) and U^mbm_s(t) denote the existing and the newly received global hypothesis sets of node s at iteration t, respectively, J_s(0) denotes the initial global hypothesis set at node s, and U^mbm_s(0) ≜ ∪_{i∈S_s} J_i(0). In each flooding iteration t = 1, 2, · · · , each local PMBM filter unionizes the new global hypotheses that its neighbors have received at the preceding iteration, i.e.,

U^mbm_s(t) = ( ∪_{i∈S_s} U^mbm_i(t − 1) ) \ J_s(t), (21)

where A \ B is the set difference of A and B. Let S^[t]_s denote the set of sensors that are at most t hops away from sensor s, including sensor s itself. Once flooding is completed at iteration T, sensor s has received the hypotheses of all sensors at most T hops away, i.e.,

J_s(T) = ∪_{i∈S^[T]_s} J_i(0). (22)

Convergence of the flooding scheme has been addressed in [41]: when t is no smaller than the diameter D_m of the network, all sensors have exactly the same information, i.e., ∀s = 1, 2, · · · , S, J_s(t ≥ D_m) = J_{1:S}.
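The flooding recursion can be sketched as follows. For brevity, this simplified variant lets each node merge its neighbors' full accumulated hypothesis sets at every iteration; the algorithm in [41] transmits only newly received hypotheses, but the fixed point after T iterations is the same (each node holds the hypotheses of all nodes at most T hops away):

```python
def flood(neighbors, initial, T):
    """Simplified hypothesis-set flooding: at each of T iterations, every node
    unionizes its own accumulated set with those of its direct neighbors."""
    J = {s: set(h) for s, h in initial.items()}
    for _ in range(T):
        J = {s: J[s] | set().union(*(J[i] for i in neighbors[s])) for s in J}
    return J

# A path network 1-2-3-4-5 (diameter 4); each node starts with one hypothesis.
nb = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
J = flood(nb, {s: {s} for s in nb}, 4)
# After T = D_m = 4 iterations every node holds all five hypotheses.
```

After one iteration each node holds only its neighbors' hypotheses, consistent with the claim that T iterations reach exactly the T-hop neighborhood.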

D. Exact MBM-AA Fusion
Given the mixture of MBMs corresponding to J_s(T) as in (22), all the (original) posterior information about the detected targets generated by the sensors in S^[T]_s is now available at sensor s. Then, following the "best-fit-of-mixture" fusion principle, we need to find a MBM that best fits this mixture of MBMs, so that each sensor maintains a PMBM conjugate prior while using the posterior information from the sensors in S^[T]_s, for which we have the following key result.

Lemma 2. Given a mixture of MBMs with probability distributions f^mbm_i(X), i ∈ S^[T]_s, the MBM that fits them best, in the sense of minimizing both the sum of 2-norm distances to these MBM distributions and the sum of KLDs relative to these MBM distributions, is their AA. That is,

f^mbm_AA(X) = arg min_g Σ_{i∈S^[T]_s} ∥g − f^mbm_i∥²₂ (23)
            = arg min_g Σ_{i∈S^[T]_s} D_KL(f^mbm_i ∥ g), (24)

subject to g being a MBM. (25)

Proof. The results (23) and (24) are the special MBM cases of (11) and (12), respectively. The constraint (25) is automatically satisfied as proved in Lemma 1.
To perform the AA fusion, the hypotheses aggregated by flooding are re-weighted by the fusion weights of their sensors of origin. This leads, at sensor s, to a mixture of re-weighted hypotheses/MBs collected from the sensors in S^[T]_s, with hypothesis j weighted by w_{s(j)} w_{s(j),j}, where s(j) denotes the sensor from which hypothesis j originates. Obviously, if T ≥ D_m the resulting MBM distribution is the AA of all sensors' MBMs as given in Lemma 1; otherwise, the result is only an approximation. It is important to note that these hypotheses are not independent but highly cross-correlated in a complicated manner: MBs from the same PMBM filter are more correlated with each other than those from different PMBM filters. However, the AA fusion has the advantage of accommodating any degree of cross-correlation while maintaining conservativeness [8], [9].

E. Localization Accuracy
While mixing the hypotheses from multiple sensors can compensate for locally missed detections and for significantly "biased" tracks (in the sense of a large offset from the target position) due to model mismatch or unknown system input, it does not improve the localization accuracy of any particular BC corresponding to a potential target. Instead, the mixture gains conservativeness and robustness at the price of a larger distribution variance. Just as a trade-off is needed between the localization error of correctly detected targets and missed/false detections in the context of performance evaluation of multi-target trackers [6], [54]-[56], the fusion needs to trade off complete information against accurate estimation.
In realistic implementations, it is our observation that "partial consensus" (i.e., fusing only the key components of the posteriors that are more likely to correspond to targets [9]-[12], [14], [18]) turns out to be very useful, if only by reducing both the number of components and the variance of the mixture. More importantly, component merging and pruning and importance sampling have been demonstrated useful for improving the AA result to gain higher maximum a posteriori estimation accuracy [9]-[12], [14], [18]; see also [7]. To do so in the proposed MBM fusion, further fusion needs to be performed between the BCs contained in the hypotheses of distinct sensors. It is, however, intractable to associate the BCs corresponding to the same target across different hypotheses of distinct sensors and to merge them accordingly for better localization accuracy. This requires "hypothesis merging" and is the key to improving the localization accuracy. We leave this to future work.

F. Communication and Computation Consideration
To save communication, one may communicate only a few MBs with higher hypothesis weights w_{s,j}, j ∈ J_s, in the MBM at each sensor s. This strategy, referred to as partial consensus, has also proved necessary for PHD-AA fusion [11], [12], [14] and MB-AA fusion [9], and provides an effective means to combat the variance-increase side effect of the AA fusion [7]. For this purpose, there are two alternative types of thresholds: specify a (maximum) number N_g ≥ 1 or a (minimum) probability threshold 0 < w_g < 1, and disseminate and fuse only the N_g hypotheses with the largest weights from each MBM, or only the hypotheses with weight greater than w_g.
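The two truncation rules can be sketched as follows, again with a MBM represented as a list of (weight, MB) pairs (an illustrative representation of our own, not the paper's):

```python
def select_hypotheses(mbm, N_g=None, w_g=None):
    """Partial consensus: keep only the N_g highest-weighted global hypotheses,
    or those with weight above w_g, then renormalize the retained weights."""
    ranked = sorted(mbm, key=lambda h: h[0], reverse=True)
    if N_g is not None:
        kept = ranked[:N_g]
    else:
        kept = [h for h in ranked if h[0] > w_g]
    total = sum(w for w, _ in kept)
    return [(w / total, mb) for w, mb in kept]
```

Renormalizing after truncation keeps the disseminated object a valid MBM, so the exact MBM-AA closure of Lemma 1 still applies to the retained hypotheses.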
For computational efficiency, the MBM may be approximated by a single MB distribution in a "best-fit-of-mixture" way [37], or simply by selecting the MB in the MBM that has the highest global hypothesis weight [39]. The fusion can then be carried out on MBs in the sense of AA fusion [9] or GA fusion [24]-[26], [39]. Obviously, a MB is a special case of a MBM with only one global hypothesis, namely |J| = 1. This avoids the non-closure problem of the MBM-GA fusion but seems unnecessary for the AA fusion which, as we have shown, can fuse MBMs exactly and directly. So far, it remains unclear how to find a MBM that best approximates the GA of MBMs. In this respect, the AA fusion is advantageous over the GA fusion.

V. SIMULATIONS
We considered a planar region [0, 400] × [0, 400] monitored by a sensor network of diameter 4 consisting of 9 sensors, as shown in Fig. 4. The trajectories of five targets in total were generated using the method given in [29, Sec. VI]. The states of new-born targets at time k followed a PPP with intensity 0.01 and a deliberately inaccurate Gaussian density. Each sensor had the same time-invariant target detection probability of 0.8 and a linear position measurement model with v_{k,1} and v_{k,2} mutually independent zero-mean Gaussian noises, each with a standard deviation of 1 m. In addition, the clutter at each sensor s = 1, 2, · · · , 9 was Poisson with rate κ_s = 10 and uniformly distributed over the sensing field. The simulation was performed over 100 runs with conditionally independent measurement series for each run.
The PMBM filter implementation used a maximum number of global hypotheses N_max = 50. Pruning was required for the maintenance of both the PPP and the MBM: weight thresholds of 10^{-2} and 10^{-3} were used to prune low-weighted hypotheses and Poisson components, respectively, and BCs whose existence probability was lower than 10^{-2} were removed. Ellipsoidal gating based on the Mahalanobis distance with gate size 4 was used for measurement-track association. For estimate extraction from the PMBM, the global hypothesis with the highest weight was identified first; according to this hypothesis, each BC with existence probability larger than the threshold 0.4 was used to extract a target estimate by taking the mean of the BC, namely estimator 1 addressed in [36].
The filter performance is evaluated by the root mean square (RMS) generalized optimal subpattern assignment (GOSPA) error [56], which decomposes into a localization error (Loc-Err) for the properly detected targets and a cardinality error for missed and false targets, i.e., d_RMS-GOSPA = d_Loc + d_Card, where the base metric d^(c)(x, y) ≜ min(d(x, y), c) between x and y is cut off at c. We refer to the cardinality error as the misdetection error (MD-Err) or the false-alarm error (FA-Err), depending on whether the estimated number of targets is smaller or larger than the truth, respectively. We used c = 20 and p = 2 in our simulation. Different numbers T of flooding iterations, from T = 0 (no information communication between sensors, namely the noncooperative mode) to T = 4 (at which the flooding algorithm has converged), were applied. Furthermore, the number N_g of disseminated hypotheses of each local sensor was set to 1, 2 and 5, respectively.
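For reference, a brute-force GOSPA evaluation (α = 2) on one-dimensional point sets can be sketched as below; it enumerates all assignments, which is fine for the handful of targets considered here (a practical implementation would use an optimal-assignment solver instead):

```python
from itertools import permutations

def gospa(X, Y, c=20.0, p=2):
    """GOSPA metric (alpha = 2) between finite point sets via brute-force
    assignment. Each matched pair costs min(d^p, c^p); each unmatched
    element costs c^p / 2."""
    if len(X) > len(Y):
        X, Y = Y, X
    n, m = len(X), len(Y)
    best = float("inf")
    for perm in permutations(range(m), n):
        cost = sum(min(abs(X[i] - Y[j]) ** p, c ** p) for i, j in zip(range(n), perm))
        best = min(best, cost)
    best += (m - n) * (c ** p) / 2.0
    return best ** (1.0 / p)
```

With c = 20 and p = 2, each missed or false target contributes c²/2 = 200 to the squared error, which illustrates why the MD error dominates the GOSPA error under a low detection probability.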
The RMS GOSPA error, Loc-Err, FA-Err and MD-Err of the PMBM filter at sensor 1, in the noncooperative mode and in the fusion mode with T = 1 and N_g = 1 or 5, are given in Fig. 5. The mean errors of all PMBM filters at the different sensors versus T are given in Fig. 6, for N_g = 1, 2 and 5, respectively. Furthermore, the mean communication cost (i.e., the number of real values broadcast by all sensors) and the computing time required for each filtering step are given in Fig. 7. The results show the following. 1) The proposed MBM-AA fusion indeed improves the filters, significantly reducing the GOSPA error compared with the noncooperative filter. In particular, by applying flooding for only one iteration (T = 1) and disseminating only a single hypothesis per sensor (N_g = 1) at each filtering step, the mean RMS GOSPA error is reduced by about 60%. However, the RMS-GOSPA/MD-error differences due to different N_g and T are insignificant, which implies that additional hypotheses do not contribute much new information. The RMS GOSPA reduction is mainly due to the reduction of the MD error, as clearly shown in Fig. 5: the two are almost equivalent and much larger than the FA and localization errors. This indicates that misdetection poses a key challenge to the PMBM filter when the target detection probability is low. 2) The localization error and the FA error are increased by the proposed fusion compared with the noncooperative mode. That is, the fusion does not give the filter better localization accuracy for confirmed targets and may make the filter more prone to false alarms. Even worse, more flooding iterations cause larger FA and localization errors.
The reasons could include the following: first, fusion of a small number of (e.g., three) sensors is sufficient to largely suppress the misdetections of a local sensor, and more sensors do not help further; second, the proposed MBM-AA fusion does not improve the accuracy of any BCs, including the significantly-weighted ones used for estimate extraction (see our analysis in Section IV-E); instead, it reduces their weights in the fused mixture. 3) The communication cost and computing time required for fusion increase almost linearly with T and N_g. Overall, the simulation in this particular scenario demonstrates that it is sufficient to communicate only the single highest-weighted hypothesis among neighbor sensors. The gain comes mainly from the reduced MD error, which is a key component of the GOSPA error when the target detection probability is low. It turns out to be unnecessary to apply the flooding information sharing for multiple iterations, for multiple hypotheses, or beyond the immediate neighborhood. These findings are based on MBM-AA fusion without MB/hypothesis merging and without Bernoulli merging/pruning in the fusion. Experimental studies in more complicated scenarios and with proper MB/Bernoulli merging/pruning deserve further investigation.

VI. CONCLUSION
In this paper, we have demonstrated that the linear, arithmetic average (AA) fusion essentially provides a theoretically best fit to the mixture of the fusing distributions, which retains the complete information from these sources. The AA of multiple multi-Bernoulli mixtures (MBMs) remains a MBM, and it thus provides an exact, closed-form solution for MBM fusion. This closure has been advocated for distributed PMBM filter design, where the PMBM posterior is decomposed into a PPP and a MBM and only a part of the MBM is involved in the flooding communication and AA fusion. A simulation based on a linear system has been given to demonstrate the performance of our approach, including its strengths and limitations. The proposed exact MBM-AA fusion benefits the filter significantly in combating misdetection but not in improving localization accuracy, for which further fusion needs to be performed on the elemental Bernoulli components contained in the hypotheses of distinct sensors. This forms our future research direction.