Probability-Possibility Transformation under Perspective of Random-fuzzy Dual Interpretation of Unknown Uncertainty

Abstract—Information preservation is recognized in this work as the only principle for probability-possibility transformation, and the normalized transformation is the right method. This rests on the viewpoint that the reason we can transform between probability and possibility is that we believe the uncertainty being handled can be information-equivalently described by both probability and possibility. That viewpoint is endorsed by the random-fuzzy dual interpretation of unknown uncertainty, which says that the unknown uncertainty being estimated can be interpreted as either randomness or fuzziness, depending on the available prior information and the perspective of cognition and modeling. The information of an uncertain variable is defined in this work as its distribution. The suggested information-preservation principle is different from Klir's principle, which is in fact an entropy-preservation principle. We then investigate the problem of information preservation and propagation in parallel probability-possibility systems; by parallel, we mean the two uncertainty systems have the same prior information. After uncertainty propagation the two parallel systems will generally bifurcate, which means information preservation only holds locally between the two parallel systems. This observation accords with our intuition, since probability and possibility use different normalizations as well as different disjunctive operators, which makes them two different uncertainty systems, appropriate for randomness and fuzziness, respectively.


The two kinds of uncertainty, randomness and fuzziness, are defined as below:
Randomness is the occurrence uncertainty of the either-or outcome of a causal experiment, characterized by the lack of predictability in mutually exclusive outcomes.
Fuzziness is the classification uncertainty of the both-and outcome of a cognition experiment, characterized by the lack of clear boundary between non-exclusive outcomes.
[Author footnote: Wei Mei (meiwei@sina.com), Gan Li (ligangopt@sina.com), Yan Xu (hbu_ami@163.com) and Lin Shi (shilin85@foxmail.com).]
Possibility theory has a structure that is parallel to probability theory, and both methods describe the resolution of uncertainty by the distribution of an uncertain variable. Probability and possibility both measure (or score) the value of uncertainty with numbers ranging from 0 to 1, which makes up the primary similarity between probability and possibility. Nevertheless, the two concepts are in nature very different. Since random outcomes are mutually exclusive, probability requires that the scores of all possible outcomes of a random variable sum to one, which is the key axiom of "additivity" [13][14][15]. In contrast, possibility is marked by the key axiom of "maxitivity" [1,2,6], which says that the possibility of the disjunction of two non-exclusive fuzzy concepts is equal to the maximum of the two constituent possibilities. Possibility theory can be regarded as the foundation of fuzzy sets, since the membership function of a fuzzy set can be recognized as the likelihood function of possibility [16], and the composition of fuzzy relations is equivalent to the composition of conditional possibilities [5].
Despite their differences, many people are interested in connecting the two kinds of uncertainty measures of probability and possibility. Some of them try to interpret one uncertainty measure by the other [17][18][19][20], whereas others intend to build up conversion methods between them [21][22][23][24][25][42]. The motivation to study probability-possibility transformation has arisen not only from our desire to comprehend the relationship between the two theories of uncertainty, but also from some practical problems [21,26]. Examples of these problems are: constructing a membership function of a fuzzy set from statistical data, constructing a probability measure from a given possibility measure in the context of decision making or systems modelling, or combining probabilistic and possibilistic information in expert systems [26][27][28][29].
By now, various methods of probability-possibility transformation have been suggested in the literature. Apart from the normalization requirements, they differ from one another substantially in the principles they are based upon [21,23]. Among them, there are basically three transformation methods. The most common is the normalized transformation, which is based on a ratio scale so as to satisfy the normalization requirements of probability and possibility, respectively. Another is the arising accumulation transformation [22,24], which stresses that there are informational differences between possibility and probability measures, so that transformation will lead to information loss or gain [22]. The third is Klir's method, which applies the uncertainty invariance principle, claiming that any transformations that are not invariant with respect to the amount of uncertainty are ill-conceived [21].
This work reexamines the problem of probability-possibility transformation, but under the perspective of the random-fuzzy dual interpretation of unknown uncertainty. For typical tasks of uncertain inference, such as estimation or recognition, the uncertainty handled is in fact unknown uncertainty that could be inferred or estimated by using available prior information and observations [7]. The dual interpretation says that the unknown uncertainty being handled could be interpreted as either randomness or fuzziness, depending on the available prior information and the perspective of cognition and modeling [7], though in a particular situation one kind of interpretation may be more reasonable than the other.
The dual interpretation suggests that the unknown uncertainty being handled could be modeled simultaneously, and hence should be modeled information-equivalently, as probability and possibility. That endorses the reason why we can transform between probability and possibility: we believe the uncertainty being handled, whether in the form of probability or possibility, should carry equivalent information. Otherwise, we would be upset with the conversion. In view of this, this work considers the information-preservation principle the only principle for probability-possibility transformation, which will be elaborated in detail in Section 4. Note that the information preservation addressed in this work is different from Klir's information preservation, which is entropy preservation in nature.
Be aware that uncertainty in the form of probability or possibility having equivalent information does not mean the two systems of probability and possibility are equivalent. This work will also investigate the problem of information preservation and propagation in parallel probability/possibility systems, in Sections 5 and 6. By parallel, we mean the two uncertainty systems have the same prior conditions in terms of equivalent information. We will see that the two parallel systems will generally bifurcate after uncertainty propagation, which means information preservation only holds locally between the two parallel systems. The conclusions we draw are meaningful for understanding the two uncertainty systems, which are suitable for handling randomness and fuzziness, respectively.
The rest of the paper is organized as follows. Section 2 reviews fundamental aspects of possibility theory. Section 3 reviews three classic methods for probability-possibility transformation. Section 4 elaborates the information-preservation principle. Sections 5 and 6 investigate information preservation and propagation in parallel probability/possibility systems. Section 7 concludes the paper.

Random/Fuzzy Variables
Definition 2.1. A random variable X is a variable whose value is subject to variations due to random uncertainty. A random variable can take on a set of possible values in a random sample space Ω_r, or its generated event space Σ_r ⊆ 2^(Ω_r).
Definition 2.2. A fuzzy variable X is a variable whose value is subject to variations due to fuzzy uncertainty. A fuzzy variable can take on a set of possible values in a fuzzy sample space Ω_f, or its generated event space Σ_f ⊆ 2^(Ω_f).
Remark: A random variable should be modeled by probability, and a fuzzy variable by possibility [4,6]. Though events in Σ_r ⊆ 2^(Ω_r) and Σ_f ⊆ 2^(Ω_f) are both not mutually exclusive, the structures of the random event space Σ_r and the fuzzy event space Σ_f are not the same, because their corresponding sample spaces Ω_r and Ω_f are defined differently [6].

Unknown Uncertainty and Uncertain Variable
Definition 2.3. Unknown uncertainty is the uncertainty carried by an unknown quantity that is assumed to take on N possible outcomes Θ = {θ_1, θ_2, …, θ_N}.
Definition 2.4. An unknown uncertain variable (in short, uncertain variable) X is a variable whose value is subject to variation due to unknown uncertainty. An uncertain variable can take on a set of possible values from Θ = {θ_1, θ_2, …, θ_N}, or its generated power set Ξ ⊆ 2^Θ.
Remark: An uncertain variable can be modeled as either a random variable or a fuzzy variable, depending on the available prior information [7].

Possibility and Conditional Possibility
Definition 2.5. Possibility (axiomatic definition) on a universal set Θ is defined as a mapping Π: 2^Θ → [0,1] such that [1,2]:
Axiom 1: Π(∅) = 0;
Axiom 2: Π(Θ) = 1;
Axiom 3 (maxitivity): Π(A ∪ B) = max(Π(A), Π(B)) for all A, B ∈ 2^Θ.
By Axioms 2 and 3 we can derive the max normalization

max_{x ∈ Θ} π(x) = 1.    (1)

Eq. (1) indicates that at least one of the elements of Θ should be fully possible, i.e., ∃ x ∈ Θ such that π(x) = 1. In contrast, probability obeys the sigma normalization below:

Σ_{x ∈ Θ} p(x) = 1.    (2)

Suppose π(x, y) is the joint possibility distribution of fuzzy variables X and Y; then the conditional possibilities π(x|y) and π(y|x) are defined as below [1,2,30]:

π(x|y) = π(x, y) / π(y),  with π(y) = max_x π(x, y),    (3)
π(y|x) = π(x, y) / π(x),  with π(x) = max_y π(x, y).    (4)
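To make the definition concrete, here is a minimal Python sketch of the max normalization of Eq. (1) and of product-based conditioning, a reconstruction consistent with the conditional possibility cited from [1,2,30]; the joint distribution and outcome labels are illustrative.

```python
# Illustrative joint possibility distribution over (x, y) pairs.
joint = {("x1", "y1"): 1.0, ("x1", "y2"): 0.4,
         ("x2", "y1"): 0.7, ("x2", "y2"): 0.2}

# Max normalization, Eq. (1): at least one outcome is fully possible.
assert max(joint.values()) == 1.0

# Marginal possibility of y1 by maxitivity over x.
poss_y1 = max(joint[("x1", "y1")], joint[("x2", "y1")])   # 1.0

# Product-based conditional possibility: poss(x|y1) = poss(x, y1) / poss(y1).
cond = {x: joint[(x, "y1")] / poss_y1 for x in ("x1", "x2")}
print(cond)   # {'x1': 1.0, 'x2': 0.7} -- itself max-normalized
```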

Normalized Transformation
The most common normalized transformation is based on the ratio scale, as expressed by [21,23]

p_i = π_i / Σ_j π_j,    (5)
π_i = p_i / max_j p_j.    (6)
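A minimal Python sketch of the two directions of the normalized transformation; the function names are illustrative, not from the paper.

```python
# Ratio-scale (normalized) probability-possibility transformation.

def poss_to_prob(poss):
    """Eq. (5): p_i = pi_i / sum_j pi_j."""
    s = sum(poss)
    return [v / s for v in poss]

def prob_to_poss(p):
    """Eq. (6): pi_i = p_i / max_j p_j."""
    m = max(p)
    return [v / m for v in p]

p = [0.5, 0.3, 0.2]
poss = prob_to_poss(p)        # [1.0, 0.6, 0.4] -- max-normalized, cf. Eq. (1)
p_back = poss_to_prob(poss)   # [0.5, 0.3, 0.2] -- the round trip is exact
print(poss, p_back)
```

The proportional rescaling is invertible, so no information is lost in either direction.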

Arising Accumulation Transformation
The arising accumulation transformation is an asymmetric transformation defined by the equations [22][23][24]

π_i = Σ_{j=i}^{N} p_j,    (7)
p_i = Σ_{j=i}^{N} (π_j − π_{j+1}) / j,  with π_{N+1} = 0,    (8)

where the outcomes are indexed so that p_1 ≥ p_2 ≥ … ≥ p_N. The transformation P → π of (7) is based on the principle of maximum specificity, which aims at finding the most informative possibility distribution. A possibility distribution π is said to be most informative if any other solution π′ is such that π ≤ π′. The inverse transformation π → P of (8) is based upon the principle of insufficient reason, which aims at finding the probability distribution that contains as much uncertainty as possible while retaining the features of the possibility distribution [22]. The arising accumulation transformation satisfies the two postulates below [22]:
1) Consistency condition: p_i ≤ π_i for all i. Here, the obtained possibility distribution should dominate the probability distribution.
2) Order preservation: p_i ≥ p_j if and only if π_i ≥ π_j. Intuitively, if two worlds are ordered in a given way in P, then π should preserve the same order.
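A Python sketch of the arising accumulation transformation of (7) and (8), assuming the outcomes are indexed in descending probability order; note that the round trip is not the identity, which illustrates the information loss or gain mentioned above.

```python
# Arising accumulation (asymmetric) transformation; assumes
# p[0] >= p[1] >= ... >= p[-1].

def prob_to_poss(p):
    """Eq. (7): pi_i = sum_{j >= i} p_j (descending-ordered p)."""
    return [sum(p[i:]) for i in range(len(p))]

def poss_to_prob(poss):
    """Eq. (8): p_i = sum_{j >= i} (pi_j - pi_{j+1}) / j, with pi_{N+1} = 0."""
    ext = list(poss) + [0.0]
    n = len(poss)
    return [sum((ext[j] - ext[j + 1]) / (j + 1) for j in range(i, n))
            for i in range(n)]

p = [0.5, 0.3, 0.2]
poss = prob_to_poss(p)        # [1.0, 0.5, 0.2]; note pi_i >= p_i (consistency)
p_back = poss_to_prob(poss)   # ~[0.7167, 0.2167, 0.0667], not the original p:
                              # the round trip is not the identity
```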

Klir's Uncertainty-preservation Transformation
Transformations P ⟷ π that preserve uncertainty were proposed by Klir [21]. It was shown by Geer and Klir that a unique transformation of this kind exists only under log-interval scales, as below [21]:

π_i = (p_i / max_j p_j)^α,    (9)
p_i = π_i^(1/α) / Σ_j π_j^(1/α),    (10)

where the value of α ∈ (0, 1] is determined by solving (11) below, which expresses the requirement that the amount of uncertainty be preserved when π is transformed into P or vice versa:

H(P) = N(π) + S(π).    (11)

H(P) is the well-known Shannon entropy of probability distribution P, given by [31]

H(P) = − Σ_i p_i log2 p_i,    (12)

and N(π) and S(π) represent the nonspecificity and strife of possibility distribution π [21], respectively, given by (with the components ordered so that π_1 ≥ π_2 ≥ … ≥ π_N and π_{N+1} = 0)

N(π) = Σ_{i=2}^{N} (π_i − π_{i+1}) log2 i,    (13)
S(π) = Σ_{i=2}^{N} (π_i − π_{i+1}) log2 [ i / Σ_{j=1}^{i} π_j ].    (14)

It is known that Klir's transformations (9) and (10) satisfy the general possibility-probability consistency condition: p_i ≤ π_i for all i. Note that given α = 1, Klir's method is the same as the normalized transformation.
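A Python sketch of Klir's method, solving for α by bisection; the nonspecificity and strife expressions follow the standard forms from Klir's theory, and the sample distribution is illustrative.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(P) in bits."""
    return -sum(v * math.log2(v) for v in p if v > 0)

def nonspecificity(poss):
    """N(pi) of a possibility distribution (sorted internally)."""
    r = sorted(poss, reverse=True) + [0.0]
    return sum((r[i] - r[i + 1]) * math.log2(i + 1) for i in range(1, len(poss)))

def strife(poss):
    """S(pi), the strife of a possibility distribution."""
    r = sorted(poss, reverse=True) + [0.0]
    return sum((r[i] - r[i + 1]) * math.log2((i + 1) / sum(r[:i + 1]))
               for i in range(1, len(poss)))

def klir_transform(p):
    """Find alpha in (0, 1] with H(P) = N(pi) + S(pi), pi_i = (p_i / max p)^alpha."""
    h = shannon_entropy(p)
    base = [v / max(p) for v in p]

    def f(a):
        poss = [b ** a for b in base]
        return nonspecificity(poss) + strife(poss) - h

    lo, hi = 1e-6, 1.0          # bisection; assumes a sign change on (0, 1]
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    return alpha, [b ** alpha for b in base]

alpha, poss = klir_transform([0.5, 0.3, 0.2])
print(alpha, poss)
```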

The Dual Interpretation of Unknown Uncertainty
For the task of uncertain inference, interpretation of the uncertainty carried by the unknown quantity relies on the available prior information and the perspective of cognition and modeling [7]. In radar target recognition, for example, the uncertainty of the unknown target type can be interpreted as either randomness or fuzziness. Under the perspective of the occurrence of an objective event, the unknown target type X should take place as an objective event from a type set that consists of possible but exclusive target types. In such a case, the unknown target type should be interpreted as randomness. Under the perspective of subjective cognition of concepts, the unknown target type X could be simultaneously classified into more than one outcome from a type set that consists of possible but non-exclusive target types, by using the available observation data and through the process of feature extraction. In such a case, the unknown target type should be interpreted as fuzziness. The fuzzy uncertainty arising here shares the same essence as that of natural fuzzy concepts such as Young and Senior, i.e., fuzziness is caused by the overlap of their intensions. Here, intension indicates the internal content (abstracted features) of a concept that constitutes its formal definition [6,43].
As discussed in the introduction, the dual interpretation of unknown uncertainty suggests that the unknown uncertainty being handled could be modeled as probability and possibility simultaneously, and information-equivalently.

Information of Uncertain Variable
For a clear understanding of information, we have referred to both academic literature [31][32][33][34][35][36] and online dictionary [37,38]. Information is notoriously a polymorphic phenomenon and a polysemantic concept so, as an explicandum, it can be associated with several explanations, depending on the level of abstraction adopted and the cluster of requirements and desiderata orientating a theory [32]. Thus, Shannon [33] and Weaver [34] supported a tripartite analysis of information in terms of 1) technical problems concerning the quantification of information and dealt with by Shannon's theory; 2) semantic problems relating to meaning and truth; and 3) what he called "influential" problems concerning the impact and effectiveness of information on human behavior, which he thought had to play an equally important role [32].
According to [32], the General Definition of (semantic) Information (GDI) has become an operational standard in the field of Information Sciences, which is clearly formulated as a tripartite definition (Fig. 1). Information is a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed [38].
This work agrees that information may manifest itself at different levels of abstraction. For instance, a word can carry a basic level of information, a sentence consisting of some words can carry more and higher-level information, and a paragraph (or even a paper) may contain much more comprehensive information. We can also say there is concept-level information, proposition-level information, and higher-level information; high-level information can be constructed from lower-level information. This work attributes information related to an uncertain variable to concept-level information, and will limit the following discussion to the information of an uncertain variable, whose definition is given below.

Definition 4.1. Information I of an uncertain variable is the resolution of its uncertainty, which is determined by the distribution of the uncertain variable X.
For a random variable X with distribution p(x), its random information is determined by

I(X) = p(x).    (15)

For a fuzzy variable X with distribution π(x), its fuzzy information is given by

I(X) = π(x).    (16)

Remark: It is straightforward to verify that Definition 4.1 of the information of an uncertain variable satisfies the GDI. This definition of information is different from the concept of information entropy H(X), which has been proposed for measuring information capacity. In communication, entropy represents the average rate at which information is produced by a stochastic source of data [37]. The definition of informationally equivalent is given below.

Definition 4.2.
A random variable and a fuzzy variable are said to be informationally equivalent if their distributions satisfy the normalization requirement of (5) or (6).
Remark: The above definition of informationally equivalent is natural, since normalization by (5) or (6) is the minimum requirement for a transformation between a possibility distribution and a probability distribution. It is safe to say that such a proportional normalization will not lead to information loss or gain. This point can be justified by the equivalence of the uniform distributions of probability and possibility, and by the equivalence of the means (expected values) of probability/possibility functions. As we all know, a uniform distribution indicates ignorance for both probability and possibility, and it can be easily verified that the uniform distributions of probability and possibility satisfy the normalizations of (5) and (6). As for their expected values, consider the center of gravity (COG) of a possibility function, which is defined as [39]

COG ≜ ∫ x π(x) dx / ∫ π(x) dx,    (17)

a concept parallel to the expected value of a probability function. By using (5), it can be easily derived, as in (18), that the COG is equal to the mean of the probability function:

COG = ∫ x π(x) dx / ∫ π(x) dx = ∫ x p(x) dx.    (18)

We therefore call the COG as defined in (17) the mean (expected value) of the possibility function.
Eq. (18) reveals that probability/possibility functions with equivalent information have identical expected values, and identical variances as well. Given the Gaussian possibility function below [11,40]

π(x; μ, Σ) = exp(−(x − μ)^T Σ^(−1) (x − μ) / 2),    (19)

for some mean μ ∈ ℝ^n and some covariance matrix Σ ∈ ℝ^(n×n), its counterpart probability function p(x; μ, Σ) can be obtained by using the transformation of (5) or (6).

This work takes information preservation, i.e., the requirement that the distributions before and after transformation be informationally equivalent in the sense of Definition 4.2, as the only principle for probability-possibility transformation. Remark: This principle is different from Klir's uncertainty-preservation principle, which is in fact an entropy-preservation principle. We conclude that the normalized transformation of (5) and (6) is the right method for probability-possibility transformation; it is, among all the methods, the simplest yet most graceful one.
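A 1-D numeric sketch of the Gaussian case: discretizing the Gaussian possibility function above on a grid and applying the sigma normalization of (5) recovers the Gaussian probability density, and the COG of (17) matches the probabilistic mean as in (18). The grid parameters are illustrative.

```python
import math

mu, sigma = 1.0, 2.0
step = 0.01
n = 4001                                    # grid over [mu - 20, mu + 20]
xs = [mu - 20.0 + step * i for i in range(n)]

# 1-D Gaussian possibility function: peak value is 1 at x = mu.
poss = [math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) for x in xs]

# Sigma normalization of Eq. (5), continuous version: divide by the integral.
norm = sum(poss) * step
p = [v / norm for v in poss]

# p now matches the N(mu, sigma^2) density; compare the peak values:
print(p[n // 2], 1.0 / (sigma * math.sqrt(2.0 * math.pi)))

# The COG of Eq. (17) equals the probabilistic mean, Eq. (18):
cog = sum(x * v for x, v in zip(xs, poss)) / sum(poss)
print(cog)   # ~= mu
```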

A Transformation Example with Dual Interpretation of Unknown Uncertainty
Example 4.1. Let us consider a vote experiment similar to that in [8]. Suppose two groups of people, each having 100 persons, are called upon. After being shown a picture of a person, they are asked to answer whether he is strong, thin or between. Obviously, there is uncertainty involved in the answer. We tend to recognize this uncertainty as fuzzy uncertainty and prefer a possibility expression for the measure of the unknown quantity (physique) being strong, thin or between. However, we will see below that the uncertainty can just as well be recognized as random uncertainty if we carry out the experiment in a different way.
Group 1: Voters are allowed to vote for 1~3 choices from the outcomes (strong, thin or between) simultaneously, which means this is a both-and choice. By definition, the uncertainty involved should be recognized as fuzziness and can be measured by possibility. The vote results are given in Table 1.
Group 2: Each voter is required to vote for only one of the outcomes (strong, thin or between), which means this is an either-or choice. By definition, the uncertainty involved should be recognized as randomness and can be measured by probability. The vote results are given in Table 2.
As we can see in Table 1, strong has 100 votes, between has 30 votes, and thin has 0 votes. Therefore, the person is voted to be strong with possibility 1 and to be between with possibility 0.3. These possibilities can be transformed into probabilities by using (5) or the method shown in Table 1. As we can see in Table 2, among 100 people, 80 vote strong, 20 vote between, and no one votes thin. Therefore, the person is voted to be strong with probability 0.8 and to be between with probability 0.2. From this vote experiment, we would like to ask two questions: 1) Can we do a transformation from possibility to probability? 2) If we can, what principle should guide the transformation?
The answer to the first question is that we can. We learned from the experiment that the uncertainty involved, which is fuzzy uncertainty in nature, can also be recognized as random uncertainty if the experiment is carried out in a different way. Therefore, the possibility describing fuzziness can be transformed into probability so as to take a random view of the fuzzy uncertainty. Since we are handling the same uncertainty carried by the unknown quantity, the expressions for it before and after transformation should contain equivalent information. Otherwise, we would be upset with the conversion. In view of this, the answer to the second question is that the information-preservation principle (under the information viewpoint of uncertainty) should be the only principle for probability-possibility transformation.
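The example above can be checked numerically; the counts are those reported from Tables 1 and 2.

```python
# Vote results of Example 4.1.
poss = {"strong": 1.0, "between": 0.3, "thin": 0.0}         # Group 1 (Table 1)
prob_group2 = {"strong": 0.8, "between": 0.2, "thin": 0.0}  # Group 2 (Table 2)

# Transform Group 1's possibilities into probabilities, Eq. (5).
s = sum(poss.values())
prob_from_poss = {k: v / s for k, v in poss.items()}
print(prob_from_poss)   # strong ~0.769, between ~0.231, thin 0.0
```

The transformed values are close to, though not identical with, Group 2's empirical frequencies; the two groups are distinct experiments, so exact agreement is not expected.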

V. INFORMATION PRESERVATION IN PARALLEL PROBABILITY/ POSSIBILITY SYSTEMS
We in this section investigate whether information preservation holds in parallel systems of probability and possibility. By parallel, we mean the two uncertainty systems have the same prior conditions in terms of equivalent information.

Information Preservation in Event Space of Parallel Probability/Possibility Systems

Proposition 5.1. Given informationally equivalent distributions p(x) and π(x) (x ∈ Θ) in the sample space Θ, information preservation between P(A) and Π(A) (A ∈ 2^Θ) in the event space 2^Θ does not hold. Intuitively, this is because P(A) is obtained by additivity (summation) whereas Π(A) is obtained by maxitivity (maximization), so the event-space measures in general no longer satisfy the normalization relation of (5) or (6).
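A numeric counterexample in the spirit of Proposition 5.1; outcome labels and values are illustrative.

```python
# Informationally equivalent point distributions over three outcomes.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
poss = {k: v / max(p.values()) for k, v in p.items()}   # Eq. (6): a:1.0, b:0.6, c:0.4

# Event measures: P by additivity, Pi by maxitivity.
P_ab = p["a"] + p["b"]                 # 0.8
Pi_ab = max(poss["a"], poss["b"])      # 1.0
P_bc = p["b"] + p["c"]                 # 0.5
Pi_bc = max(poss["b"], poss["c"])      # 0.6

# If information preservation held on the event space, Pi would be a fixed
# rescaling of P across all events; the ratios differ, so it does not.
print(Pi_ab / P_ab, Pi_bc / P_bc)      # 1.25 vs 1.2
```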

Information Preservation in 2-Ary Distribution of Parallel Probability/ Possibility Systems
We consider two situations, as given in Propositions 5.2 and 5.3 below.

Proposition 5.2. Given informationally equivalent marginal distributions p(x)/π(x) and conditional distributions p(y|x)/π(y|x), the derived joint distributions p(x, y) and π(x, y) are not informationally equivalent.

Proof: As shown in (20), the joint distribution p(x, y) = p(x) p(y|x) can be rewritten in terms of π(x) and π(y|x), where the expression to the right of the second equal sign comes from substituting p(x) and p(y|x) by π(x) and π(y|x) using (5); the result shows that p(x, y) does not satisfy (5) with respect to π(x, y). Similarly, we can verify by (21) that the joint distribution π(x, y) does not satisfy (6).

Proposition 5.3. Given informationally equivalent joint distributions p(x, y)/π(x, y), the derived conditional distributions are informationally equivalent, whereas the derived marginal distributions are not.

Proof: Following the line of (20) and (21), we can verify by (22) and (23) that the derived conditional distributions p(y|x) and π(y|x) do satisfy (5) and (6), respectively, whereas (24) and (25) show that the derived marginal distributions do not.
One may worry that the two inequalities in (24) and (25) might still allow p(x)/π(x) to happen to be informationally equivalent. This worry can be eliminated by a proof by contradiction. Suppose p(x)/π(x) are informationally equivalent; by (22) and (23) we know p(y|x)/π(y|x) are informationally equivalent, and then, according to Proposition 5.2, p(x, y)/π(x, y) will not be informationally equivalent, which contradicts the prior setting of Proposition 5.3.
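A numeric check of Propositions 5.2-5.3 with illustrative values: the joint distributions built from informationally equivalent marginals and conditionals are not themselves informationally equivalent, while entries within the maximal row (reflecting the conditionals) still agree.

```python
# Probabilistic inputs (illustrative).
p_x = [0.6, 0.4]
p_y_given_x = [[0.7, 0.3],    # p(y|x1)
               [0.2, 0.8]]    # p(y|x2)

def to_poss(p):               # Eq. (6): ratio-scale transform
    m = max(p)
    return [v / m for v in p]

# Informationally equivalent possibilistic inputs.
poss_x = to_poss(p_x)
poss_y_given_x = [to_poss(row) for row in p_y_given_x]

# Joint distributions built in each system (product rule in both).
p_joint = [[p_x[i] * p_y_given_x[i][j] for j in range(2)] for i in range(2)]
poss_joint = [[poss_x[i] * poss_y_given_x[i][j] for j in range(2)]
              for i in range(2)]

# Ratio-scale transform of the probabilistic joint vs the possibilistic joint.
flat_p = [v for row in p_joint for v in row]
flat_pi = [v for row in poss_joint for v in row]
print(to_poss(flat_p))   # differs from flat_pi outside the maximal row
print(flat_pi)
```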

VI. INFORMATION PROPAGATION DOWN THROUGH PARALLEL PROBABILITY/ POSSIBILITY INFERENCES
We in this section compare information propagation down through parallel probability/ possibility inferences by two typical mechanisms: probability/ possibility update and combination of stochastic/ fuzzy relations.

Information Propagation in Uncertainty Update
Information propagation in the form of probability update takes the Bayesian form of (26) below, where p(x|z) is the posterior probability, p(x) is the prior probability, and p(z|x) is the probability likelihood of x:

p(x|z) = p(z|x) p(x) / Σ_x p(z|x) p(x).    (26)

Possibility update, as given below, has a parallel structure to Bayesian update [5,41]:

π(x|z) = π(z|x) π(x) / max_x [π(z|x) π(x)].    (27)

Define π(z|x) and π(x) that are informationally equivalent to p(z|x) and p(x) as follows:

π(z|x) = p(z|x) / max_z p(z|x),    (28)
π(x) = p(x) / max_x p(x).    (29)

Then, by putting (28) and (29) into (27), with the constant denominator of (29) cancelling out of the max normalization while the x-dependent denominator of (28) remains, we have

π(x|z) = [p(z|x) p(x) / m(x)] / max_x [p(z|x) p(x) / m(x)],    (30)

where

m(x) = max_z p(z|x).    (31)

As we can see from (26) and (30), after uncertainty propagation, information preservation will no longer hold between the updated probability and possibility, since the π(x|z) of (30) is in general not the ratio-scale transform of the posterior p(x|z) of (26).
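A numeric sketch of the bifurcation after a single update step, with illustrative prior and likelihood values; the likelihood rows are max-normalized over z, as (28) requires for a proper conditional possibility.

```python
# Probabilistic prior and likelihood table p(z|x) (illustrative values).
prior_p = [0.5, 0.3, 0.2]
like = [[0.2, 0.8],          # p(z|x1) over observations z1, z2
        [0.6, 0.4],          # p(z|x2)
        [0.9, 0.1]]          # p(z|x3)
z = 0                        # observe z1

# Informationally equivalent possibilistic inputs: prior via the ratio scale,
# likelihood normalized over z for each x so that max_z poss(z|x) = 1.
prior_pi = [v / max(prior_p) for v in prior_p]
like_pi = [[v / max(row) for v in row] for row in like]

# Bayes update, Eq. (26).
u = [like[i][z] * prior_p[i] for i in range(3)]
post_p = [v / sum(u) for v in u]

# Possibility update, Eq. (27).
u_pi = [like_pi[i][z] * prior_pi[i] for i in range(3)]
post_pi = [v / max(u_pi) for v in u_pi]

# The ratio-scale transform of the Bayes posterior no longer matches the
# possibilistic posterior: the parallel systems have bifurcated.
print([v / max(post_p) for v in post_p])   # ~[0.556, 1.0, 1.0]
print(post_pi)                             # ~[0.417, 1.0, 0.667]
```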

 ()
As we can see from (35), after uncertainty propagation, information preservation will not hold between the combined stochastic relation and the combined fuzzy relation.

VII. CONCLUSION
Under the perspective of the dual interpretation of unknown uncertainty, the normalized transformation is recognized as the right method for probability-possibility transformation, satisfying the information-preservation principle. Given the same prior information, the two parallel systems of probability and possibility will generally bifurcate after uncertainty propagation. This observation accords with our intuition, since probability and possibility use different normalizations as well as different disjunctive operators. As typical applications to target recognition and multiple-model state estimation, possibility inference exhibits significantly better performance than probability inference [4,6], whereas the possibilistic stochastic filter performs identically to the standard Bayes filter [47] and reduces to the same form as the Kalman filter in the linear-Gaussian case [48].
Future research will conduct more performance analysis on the difference of the two uncertainty systems of probability and possibility, and explore more applications including estimation & recognition in the framework of possibility theory.