An Improved Expression for Information Quality of Basic Probability Assignment and Its Application in Fault Diagnosis



I. INTRODUCTION
The world is pervaded with uncertainty, and decisions must often be made on the basis of uncertain information. Dempster-Shafer evidence theory is a simple and effective framework for modeling uncertain information [1]-[4]. It is widely used in uncertainty reasoning and can process many types of real-world information to support accurate decisions. Many other mathematical models are also used for uncertainty modeling, such as D numbers [5]-[9], Z numbers [10]-[13], fuzzy sets [14]-[17], intuitionistic fuzzy sets [18]-[21], the two-dimensional belief function [22], intuitionistic evidence sets [23], and so on.
Information quality was first proposed by Yager and Petry, based on Gini entropy, to measure the uncertainty of a probability distribution [24], [25]. It has been widely used in pattern classification [26], [27], decision making [28], [29], and other fields [30]-[36]. Li et al. proposed a generalized expression for the information quality of a basic probability assignment in Dempster-Shafer evidence theory [37], which broadens the scope of application of information quality.
Yager and Petry's method does not consider the length (cardinality) of each element and treats elements of different lengths equally, which is counter-intuitive. Li et al.'s method takes the length of each element into account and fully considers the potential uncertainty created by non-singleton elements. However, neither method accounts for the intersection between statements, so counter-intuitive results may be obtained when elements intersect. To address this issue, an improved expression for information quality is proposed that considers both the length of the frame of discernment (FOD) and the influence of intersection among statements. An exponential term is added to account for the effects of intersection. Under certain conditions, the proposed expression degenerates into the generalized form of information quality and into the information quality proposed by Yager and Petry.
The rest of this paper is organised as follows. Section II introduces the preliminaries, including Dempster-Shafer evidence theory, information quality, and the generalized form of information quality. The improved expression for information quality is proposed in Section III. In Section IV, numerical examples illustrate the effectiveness of the proposed method. In Section V, an application in fault diagnosis further demonstrates its effectiveness. Finally, conclusions are drawn in Section VI.

II. PRELIMINARIES
In this section, the preliminaries will be briefly introduced, including Dempster-Shafer evidence theory and information quality.

A. DEMPSTER-SHAFER EVIDENCE THEORY
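Dempster-Shafer evidence theory can represent uncertainty effectively, and it is widely used in many fields, such as pattern classification [38]-[40], multi-criteria decision making [41]-[43], and others [44]-[52]. A brief introduction to Dempster-Shafer evidence theory is given as follows.
Definition 2.1: A set of hypotheses $\Theta$ is called the frame of discernment. It is defined as follows:
$$\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$$
The power set $2^{\Theta}$ is defined as:
$$2^{\Theta} = \{\emptyset, \{\theta_1\}, \ldots, \{\theta_N\}, \{\theta_1, \theta_2\}, \ldots, \Theta\}$$
where $\emptyset$ is the empty set.
With Dempster's combination rule, counter-intuitive results may be obtained when the information is highly conflicting [53]. Hence, many methods have been proposed to address this issue [54]-[56]. The weighted average method was proposed by Deng et al. based on Murphy's average method and the Jousselme distance [57]-[59].
Definition 2.4: Given two mass functions $m_1$ and $m_2$, the Jousselme distance between $m_1$ and $m_2$ is defined as:
$$d(m_1, m_2) = \sqrt{\frac{1}{2} \left(\vec{m}_1 - \vec{m}_2\right)^{T} D \left(\vec{m}_1 - \vec{m}_2\right)}$$
where $\vec{m}_1$ and $\vec{m}_2$ are the respective mass functions in vector notation (each of dimension $2^{|\Theta|}$), and $D$ is a $2^{|\Theta|} \times 2^{|\Theta|}$ matrix whose elements are
$$D(s_1, s_2) = \frac{|s_1 \cap s_2|}{|s_1 \cup s_2|}, \quad s_1, s_2 \in 2^{\Theta}$$
The similarity measure between $m_1$ and $m_2$ can be defined as
$$Sim(m_1, m_2) = 1 - d(m_1, m_2)$$
Suppose there are $n$ mass functions. An $n \times n$ similarity matrix can then be constructed as follows, where $S_{ij} = Sim(m_i, m_j)$:
$$SMM = \begin{bmatrix} 1 & S_{12} & \cdots & S_{1n} \\ S_{21} & 1 & \cdots & S_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ S_{n1} & S_{n2} & \cdots & 1 \end{bmatrix}$$
The support degree of a mass function $m_i$ ($i = 1, 2, \ldots, n$) can be defined as:
$$Sup(m_i) = \sum_{j=1, j \neq i}^{n} Sim(m_i, m_j)$$
The credibility degree $Crd(m_i)$ of a mass function $m_i$ ($i = 1, 2, \ldots, n$) is obtained by normalizing its support degree (Eq. (7)). The mass functions are then combined by a weighted average that uses the credibility degrees as weights, where $A$ is a focal element of $m$.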
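The chain from distance to credibility to weighted average can be sketched in Python. This is a minimal illustration, not the paper's implementation: mass functions are represented as dictionaries from `frozenset` focal elements to masses, the three example BPAs are invented, and the credibility is normalized by the sum of the support degrees, which is one common choice and an assumption here (the normalization in Eq. (7) may differ). All function names are hypothetical.

```python
import math
from itertools import combinations

def subsets(frame):
    """All non-empty subsets of the frame, as frozensets."""
    items = sorted(frame)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def jousselme_distance(m1, m2, frame):
    """sqrt(0.5 * (m1 - m2)^T D (m1 - m2)), with D(s1, s2) = |s1 & s2| / |s1 | s2|."""
    subs = subsets(frame)
    diff = [m1.get(s, 0.0) - m2.get(s, 0.0) for s in subs]
    total = 0.0
    for a, da in zip(subs, diff):
        for b, db in zip(subs, diff):
            total += da * db * len(a & b) / len(a | b)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.5 * total, 0.0))

def credibility(masses, frame):
    """Support = sum of similarities to the other BPAs; credibility = support / total support.
    The sum normalization is an assumption of this sketch."""
    n = len(masses)
    sim = [[1.0 - jousselme_distance(masses[i], masses[j], frame)
            for j in range(n)] for i in range(n)]
    sup = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    total = sum(sup)
    return [s / total for s in sup]

def weighted_average(masses, weights):
    """Weighted average of the BPAs: m(A) = sum_i w_i * m_i(A)."""
    fused = {}
    for m, w in zip(masses, weights):
        for focal, value in m.items():
            fused[focal] = fused.get(focal, 0.0) + w * value
    return fused

# Hypothetical example: two BPAs that agree and one that conflicts.
FRAME = {"a", "b", "c"}
A, B, C = frozenset("a"), frozenset("b"), frozenset("c")
example_masses = [{A: 0.8, B: 0.2}, {A: 0.7, B: 0.3}, {B: 0.9, C: 0.1}]
crd = credibility(example_masses, FRAME)
fused = weighted_average(example_masses, crd)
```

In this example the conflicting third BPA receives the smallest credibility, so it contributes least to the averaged mass function.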

B. INFORMATION QUALITY
Entropy is a measurement of the uncertainty of information. The larger the value of entropy, the greater the uncertainty of information. ... [70], generalized belief entropy [71] and so on [72]-[74]. Based on Gini entropy [25], information quality was proposed by Yager and Petry as another way to measure the degree of uncertainty of information [24]. The larger the value of information quality, the smaller the uncertainty of information. Information quality is defined as follows.
Definition 2.5: Given a probability distribution $p_i = (p_{i1}, p_{i2}, \ldots, p_{in})$, the information quality of $p_i$ is defined as:
$$IQ_{p_i} = \sum_{j=1}^{n} (p_{ij})^2 \tag{10}$$
When some $p_{ij} = 1$, the information quality reaches its maximum value of 1. When all $p_{ij} = \frac{1}{n}$, which corresponds to the greatest uncertainty of information, the information quality takes its smallest value, $\frac{1}{n}$.
Inspired by the idea of Deng entropy [64], Li et al. proposed a generalized form of information quality for basic probability assignments [37], which is defined as follows.
Definition 2.6: Given a basic probability assignment $m_i$, the generalized information quality of $m_i$ is defined as:
$$IQ_{m_i} = \sum_{A \subseteq \Theta,\, A \neq \emptyset} \left( \frac{m_i(A)}{2^{|A|} - 1} \right)^2 \tag{11}$$
When all the statements are single elements, $2^{|A|} - 1 = 1$ for every focal element, and the generalized form of information quality degenerates into the form proposed by Yager and Petry.
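The two definitions and the degeneration property can be checked numerically. The sketch below assumes that Yager and Petry's information quality is the sum of squared probabilities and that the generalized form divides each mass by $2^{|A|} - 1$ before squaring, in the manner of Deng entropy. The function names and example values are hypothetical.

```python
def iq_probability(p):
    """Information quality of a probability distribution: sum of squared probabilities."""
    return sum(x * x for x in p)

def iq_generalized(m):
    """Generalized information quality of a BPA given as {frozenset: mass}.
    Each mass is scaled by 1 / (2^|A| - 1) before squaring; the empty set is skipped."""
    return sum((v / (2 ** len(a) - 1)) ** 2 for a, v in m.items() if a)

# Extremes of the probabilistic form: a certain outcome versus a uniform distribution.
iq_certain = iq_probability([1.0, 0.0, 0.0, 0.0])
iq_uniform = iq_probability([0.25, 0.25, 0.25, 0.25])

# Hypothetical BPAs: singletons only, and all mass on a two-element set.
m_single = {frozenset("a"): 0.5, frozenset("b"): 0.5}
m_multi = {frozenset("ab"): 1.0}
iq_single = iq_generalized(m_single)
iq_multi = iq_generalized(m_multi)
```

With singleton focal elements the generalized value coincides with the probabilistic one, while assigning all mass to a two-element set lowers the value because of the $2^{|A|} - 1$ divisor.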

III. PROPOSED METHOD
In this paper, we focus on a shortcoming of the generalized form of information quality: it ignores the intersection between statements. In this section, we propose a new form of information quality based on the generalized form. The proposed form, Eq. (12), considers both the length of the FOD and the intersection between statements.
In Eq. (12), $|A|$ denotes the cardinality of proposition $A$, $|A \cap B|$ denotes the cardinality of the intersection between propositions $A$ and $B$, and $|X|$ denotes the length of the FOD. When no two propositions intersect, the proposed form of information quality degenerates into Eq. (11). If, in addition, belief is assigned only to single elements, the proposed form degenerates into Eq. (10).
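Because the expression of Eq. (12) is not reproduced in this text, the following Python sketch is only a guess at its shape and should not be read as the authors' formula. It assumes that each term of the generalized form is multiplied by an exponential factor $e^{\sum_{B \neq A} |A \cap B| / (2^{|X|} - 1)}$ before squaring, where the sum runs over the other focal elements. This exponent is an assumption chosen to satisfy the two degeneration properties stated above. The function name and example BPAs are hypothetical.

```python
import math

def iq_improved_sketch(m, frame_size):
    """Assumed form: sum over focal A of (m(A) / (2^|A| - 1) * exp(overlap_A))^2,
    with overlap_A = sum over other focal B of |A & B| / (2^frame_size - 1).
    This is an illustrative assumption, not the paper's Eq. (12)."""
    focal = [a for a, v in m.items() if a and v > 0]
    total = 0.0
    for a in focal:
        overlap = sum(len(a & b) for b in focal if b != a)
        factor = math.exp(overlap / (2 ** frame_size - 1))
        total += (m[a] / (2 ** len(a) - 1) * factor) ** 2
    return total

# Hypothetical BPAs on a frame with four elements.
m_overlap = {frozenset("ab"): 0.5, frozenset("bc"): 0.5}   # focal elements share "b"
m_disjoint = {frozenset("ab"): 0.5, frozenset("cd"): 0.5}  # no intersection
m_single = {frozenset("a"): 0.5, frozenset("b"): 0.5}      # singletons only

iq_overlap = iq_improved_sketch(m_overlap, 4)
iq_disjoint = iq_improved_sketch(m_disjoint, 4)
iq_singletons = iq_improved_sketch(m_single, 4)
```

Under this assumed form, the disjoint case reproduces the generalized value, the singleton case reproduces the sum of squared masses, and intersecting focal elements raise the information quality.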

IV. NUMERICAL EXAMPLES
In this section, numerical examples are given to demonstrate the effectiveness of the proposed form of information quality.
Intuitively, the information quality of $m_1$ is larger than that of $m_2$ because the focal elements of $m_1$ intersect. Although the distributions of the two mass functions are similar, $m_1$ has less information volume than $m_2$ because $m_1$ involves fewer targets than $m_2$. Hence, $m_1$ contains less information than $m_2$, and the information quality of $m_1$ is larger than that of $m_2$.
The information quality of $m_1$ and $m_2$ can be calculated with Yager and Petry's method [24], with the generalized form of information quality [37], and with the proposed method.
Intuitively, the information qualities of $m_1$, $m_2$ and $m_3$ are not the same. The information quality of $m_1$ is the lowest, while that of $m_2$ is the highest, because the elements in $m_2$ have the largest intersection.
Again, the information quality can be obtained with Yager and Petry's method [24], with the generalized form of information quality [37], and with the proposed method.
Table 2 shows the logarithmic value of the information quality calculated by the proposed method in different cases. As the size of $A$ becomes larger, the information qualities calculated by Eq. (10) and Eq. (11) approach a certain value. Because of the effect of intersecting elements, the information quality calculated by the proposed method is larger than that calculated by Eq. (10) and smaller than that calculated by Eq. (11). It is reasonable that the elements share more mutual information as $A$ changes, which leads to an increase in information quality.

V. APPLICATION
In this section, an application in fault diagnosis is investigated using the proposed expression for information quality. The case study in [75] is revisited. Based on the method proposed by Yuan et al. [76], a new method using the improved expression for information quality is proposed, and its main steps are shown in Figure 3.

[Figure 3: flowchart of the main steps of the proposed method, from Start to End, including the step "Obtain credibility degree and information quality by Eq. (7) and Eq. (12)".]
In the example in [75], the three fault types are called $F_1$, $F_2$ and $F_3$. The hypothesis set of faults is $\Theta = \{F_1, F_2, F_3\}$. The three sensors are independent. The results of fault diagnosis are called bodies of evidence (BOEs), denoted as $E_1$, $E_2$ and $E_3$. The BPAs of the diagnosis results are shown in Table 3.
The BOEs can be fused directly with Dempster's combination rule in Eq. (3). To solve the problem that arises with this direct fusion, a fault diagnosis method is proposed in [76]. The reliability of each sensor is defined as the weight of a BOE. The weight of each BOE is defined as the product of a static reliability $w_s$ and a dynamic reliability $w_d$:
$$w(E_i) = w_s(E_i) \times w_d(E_i)$$
Here, the static reliabilities are $w_s(E_1) = 1$, $w_s(E_2) = 0.2040$ and $w_s(E_3) = 1$. The dynamic reliability is defined with the use of information quality.
In this example, the credibility degree and information quality can be obtained by Eq. (7) and Eq. (12). Their values are: $Crd(m_1) = 1$, $Crd(m_2) = 0.5523$. The weight of each BOE can then be calculated.
The final fused result can be obtained by Eq. (3) and Eq. (9): $m(F_1) = 0.8996$, $m(F_2) = 0.0685$, $m(F_2, F_3) = 0.0245$, $m(\Theta) = 0.0074$. It is clear that $F_1$ is the fault with the highest probability. The results obtained with different methods are shown in Table 4. The application in fault diagnosis shows the effectiveness of the proposed form of information quality. It also indicates promising application prospects for this new form of information quality.
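The fusion step can be illustrated with a short Python sketch of Dempster's combination rule applied to a weighted-average BPA. The BPAs and weights below are invented for illustration and are not the values of Table 3. The sketch also assumes, following Murphy-style averaging, that the averaged BPA is combined with itself $n - 1$ times for $n$ BOEs. All names are hypothetical.

```python
def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, assign each product to the intersection
    of the two focal elements, and renormalize by 1 - K, where K is the mass
    that falls on empty intersections."""
    combined = {}
    conflict = 0.0
    for b, vb in m1.items():
        for c, vc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + vb * vc
            else:
                conflict += vb * vc
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

F1, F2, F3 = frozenset({"F1"}), frozenset({"F2"}), frozenset({"F3"})
THETA = F1 | F2 | F3

# Hypothetical BPAs for three sensors; the second one conflicts with the others.
bpas = [
    {F1: 0.60, F2: 0.10, F2 | F3: 0.10, THETA: 0.20},
    {F1: 0.05, F2: 0.80, F2 | F3: 0.05, THETA: 0.10},
    {F1: 0.70, F2: 0.10, F2 | F3: 0.10, THETA: 0.10},
]
# Hypothetical weights (static reliability times dynamic reliability), normalized to sum to 1.
raw_weights = [1.0, 0.2, 0.9]
weights = [w / sum(raw_weights) for w in raw_weights]

# Weighted-average BPA.
average = {}
for bpa, w in zip(bpas, weights):
    for focal, value in bpa.items():
        average[focal] = average.get(focal, 0.0) + w * value

# Combine the averaged BPA with itself n - 1 times.
fused = average
for _ in range(len(bpas) - 1):
    fused = dempster_combine(fused, average)

most_likely = max(fused, key=fused.get)
```

Because the conflicting sensor receives a small weight, the fused mass concentrates on $F_1$ in this toy example.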

VI. CONCLUSION
In this paper, an improved expression for information quality is proposed, based on the generalized form of information quality, that considers the scale of the frame of discernment and the intersection between statements in a basic probability assignment. Numerical examples are presented to show the effectiveness of the proposed method, and the results indicate that it performs better than the previous methods. In addition, an application in fault diagnosis illustrates the effectiveness of the proposed method. In the future, we will further explore other applications of this new form of information quality for basic probability assignments.