A Rate Distortion Approach to Goal-Oriented Communication

Abstract-A variant of a robust description source coding framework motivated by goal-oriented semantic information transmission is studied here. Considering two individual distortion constraints and input and output data that take values in finite sets, we prove a general result that provides in parametric form the various cases of optimal solutions of this problem. Then, we derive structural properties of the solution when it achieves the best rates. Capitalizing on these results, we examine the structure of the solution for one case study of general binary alphabets under Hamming distortions and solve a special case in closed form. We also solve another general binary alphabet case where a Hamming and an erasure distortion are used, as a means to highlight the importance of selecting the type of distortion constraint in the problem.

I. INTRODUCTION
Shannon, in his seminal work [1], deliberately set aside the semantic aspects of a message and its impact. Nevertheless, in [2], he indirectly provided a means to study semantic information sources, because the coding aspect determined by the probabilistic model of the source is dictated by a distortion constraint imposed on the system. Despite various endeavors [3]–[9], a general, well-accepted theory of semantic information, with tangible applications to communication systems, remains elusive. The quest for such a theory has recently gained new impetus [10]–[12], fueled by the emergence of networks of autonomous agents with advanced sensing, learning, and decision-making capabilities.
In this work, we revisit a lossy compression framework, recently introduced in [13], [14], considering finite alphabet sources, and we study the effect of multiple individual distortion criteria in goal-oriented semantic communication. The objective of this work is twofold. First, we aim to complement the study of [13], [14], which only considers continuous alphabet sources (i.e., i.i.d. Gaussian sources) and mean square error (MSE) distortion criteria. Second, we aim to further emphasize the role of the distortion constraints in goal-oriented communication by showing cases with outcomes that do not appear through the analysis of [13], [14].
The rate distortion framework introduced in [13], [14] is a combination of the indirect rate distortion problem [15]–[17] and the direct rate distortion problem [2], [18] with two individual distortion criteria. The main results therein include a characterization for i.i.d. Gaussian or linear-based models and its solution, which reveals that the best way to transmit information is by choosing the maximum achievable rates obtained via the interplay between the direct and the indirect lossy compression problems. The rate distortion framework in [13], [14] can be seen as a generalization of the robust description problem for two individual distortion criteria, which in turn is a special case of the two description coding problem [19]. It should be noted that the rate distortion framework with two individual distortion criteria has been studied throughout the years in many papers under various contexts, see, e.g., [20]–[23]. Another relevant yet different setup is the recently introduced rate-distortion-perception representations, see, e.g., [24], [25] (and the references therein), in which perception quality, measured by some divergence between distributions, is included in addition to the classical distortion criterion. One major difference between rate-distortion-perception problems and the setup in [13], [14] and ours is that in the former the characterizations are solved for various examples using each distortion constraint separately, whereas in the latter one can study, from an optimization standpoint, the joint behavior of the two distortion penalties.
In this paper, we consider a variation of the robust source coding model [13], [14] that captures goal-oriented semantic attributes and intrinsic representations of information (e.g., features, structural and qualitative properties, embeddings). We first derive a general theorem, which gives parametrically the implicit solution of the operational characterization of the problem for arbitrary finite alphabet sources with two individual distortion constraints. This theorem is the basis for creating generalized Blahut-Arimoto algorithms that solve the problem in full generality (Theorem 1). Furthermore, we derive structural properties that characterize the best achievable rates of the general problem (Lemma 2). Then, we apply these two general results to two problems (application examples) using specific setups with general binary alphabets and two types of distortion measures, namely Hamming and erasure distortions. For Problem 1, we derive structural properties of the optimal minimizer (test channel) consistent with Lemma 2 and characterize its solution (Theorem 2). We enhance this result by solving a special case in closed form to illustrate the rate distortion surface of the problem (Example 1). For Problem 2, we characterize and solve the solution in closed form (Theorem 3). An interesting observation that stems from Theorem 3 is that, depending on the distortion constraint, it is possible to make the system choose which source (i.e., semantic or observation) to transmit. Simply put, in goal-oriented communication, selecting the type of individual distortion measures according to the application/task requirements can significantly affect the remote reconstruction of the semantic message.

II. PROBLEM STATEMENT AND PRELIMINARIES
We consider a memoryless source described by the tuple (x, z) with probability distribution p(x, z) in the product finite alphabet space X ×Z. The semantic information of the source is in x whereas z is the noisy observation at the encoder side. The goal is to study how the distortion penalties can affect goal-oriented communication and source reconstruction using lossy source coding. The problem setup is similar to the one proposed in [13, Section II], [14].
Formally, the system model (without the cost penalties) can be described as follows. As an information source, we consider a sequence of n-length independent and identically distributed (i.i.d.) random variables (x^n, z^n). At the encoder, f_n, the system observes z^n and describes it by an index M ∈ {1, 2, ..., 2^{nR}}. At the decoder, g_n, the message set is mapped into the estimates (x̂, ẑ) drawing values from the finite set X̂ × Ẑ. This setup is illustrated in Fig. 1.

[Fig. 1. System model: the source x^n passes through the noisy channel p(z^n|x^n); the encoder observes z^n and produces the message M, and the decoder outputs the estimates (x̂^n, ẑ^n).]

Achievability. We say that the rate distortion triplet (R, D_s, D_o) is achievable if there exists a sequence of encoder-decoder pairs (f_n, g_n) at rate R whose reconstructions asymptotically satisfy the two individual distortion constraints.

A. Characterization of the operational rates
The characterization of the operational rates for the specific problem is given by the following lemma.

Lemma 1. (Characterization) For a given p(x) and p(z|x), the rate distortion function of the setup in Fig. 1 is characterized as

R(D_s, D_o) = \min_{q(\hat z, \hat x|z):\, E[\hat d_s(z, \hat x)] \le D_s,\; E[d_o(z, \hat z)] \le D_o} I(z; \hat z, \hat x) \overset{(a)}{=} \min I(p(z), q(\hat z, \hat x|z)),   (3)

where (a) demonstrates the functional dependence of the mutual information on {p(z), q(ẑ, x̂|z)} and \hat d_s(z, \hat x) = \sum_{x \in X} p(x|z)\, d_s(x, \hat x) is the transformed (indirect) semantic distortion.
Proof: We omit the proof because it is a combination of the well-known approach widely used to transform the indirect rate distortion function [15]–[17] into a direct rate distortion function formulation, followed by a re-derivation of the achievability of a special case of the two description source coding problem called robust description [19, Theorem 2]. A sketch of the proof is given in [13, Theorem 1].
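The indirect-to-direct transformation mentioned above replaces the semantic distortion d_s(x, x̂) by its conditional expectation given the observation, d̂_s(z, x̂) = Σ_x p(x|z) d_s(x, x̂), so that the indirect problem becomes a direct one for the observed source z. A minimal numerical sketch of this step (the source and channel values below are illustrative assumptions, not data from the paper):

```python
import numpy as np

# Illustrative joint source (assumed values): binary semantic source p(x)
# and binary observation channel p(z|x).
p_x = np.array([0.5, 0.5])                    # p(x), x in {0, 1}
p_z_given_x = np.array([[0.85, 0.15],         # p(z|x), rows indexed by x
                        [0.15, 0.85]])

d_s = 1.0 - np.eye(2)                         # Hamming distortion d_s(x, xhat)

# Bayes' rule: p(x|z) = p(z|x) p(x) / p(z).
p_xz = p_z_given_x * p_x[:, None]             # joint p(x, z), indexed [x, z]
p_z = p_xz.sum(axis=0)                        # observation marginal p(z)
p_x_given_z = (p_xz / p_z).T                  # posterior p(x|z), indexed [z, x]

# Transformed distortion: dhat_s(z, xhat) = sum_x p(x|z) d_s(x, xhat).
dhat_s = p_x_given_z @ d_s
```

For this symmetric toy pair, d̂_s is an affine function of the Hamming distortion on z, with the posterior error probability 0.15 as its minimum attainable value.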
In what follows we state some technical remarks related to the optimization problem in (3).
The following properties of (3) can be obtained using standard arguments that stem from classical rate distortion theory, see, e.g., [18].
We now present some obvious bounds that always hold (but are not necessarily achievable) for (3):

\max\{R(D_s), R(D_o)\} \overset{(a)}{\le} R(D_s, D_o) \overset{(b)}{\le} R(D_s) + R(D_o),   (4)

where (R(D_s), R(D_o)) denote the standard rate distortion functions obtained via their individual distortion criteria. Note that (b) occurs when we choose a strategy that reconstructs x̂ and ẑ independently, which is always allowed. This upper bound is tight if p(ẑ, x̂) = p(ẑ)p(x̂). On the other hand, (a) is the best achievable rate because (3) cannot be lower than the best rate achieved in either less constrained problem (the individual rate distortion problems). We note that the lower bound in (4) appears to be the solution for i.i.d. Gaussian sources with individual MSE distortion constraints [13], [14]. We conclude this remark by pointing out that the constraint set in (3) is closed and bounded, hence compact (for finite alphabets), and the objective function in (3) is lower semicontinuous with respect to q(ẑ, x̂|z). As a result, by the extreme value theorem, the infimum is attained by some q*(ẑ, x̂|z), and we can formally replace it with a minimum in the sequel.

III. MAIN RESULTS
In this section, we present our main results. Before giving our first result, we note that the constrained problem in Lemma 1 can be written as an unconstrained problem via the Lagrange duality theorem [26] as follows:

R(D_s, D_o) = \max_{s_1 \le 0,\, s_2 \le 0}\; \min_{q(\hat z, \hat x|z)} \Big\{ I(z; \hat z, \hat x) - s_1\big(E[\hat d_s(z, \hat x)] - D_s\big) - s_2\big(E[d_o(z, \hat z)] - D_o\big) \Big\}.   (5)

In view of (5) we can prove the following general result.
Theorem 1. (Optimal parametric solution of (3)) Suppose that p(x) and p(z|x) are given. Then, the following parametric solutions for (3) may appear.
(i) If s_1 < 0 and s_2 < 0, the implicit optimal form of the minimizer that achieves the minimum in (3) is

q^*(\hat z, \hat x|z) = \frac{p^*(\hat z, \hat x)\, e^{s_1 \hat d_s(z, \hat x) + s_2 d_o(z, \hat z)}}{\sum_{\hat z, \hat x} p^*(\hat z, \hat x)\, e^{s_1 \hat d_s(z, \hat x) + s_2 d_o(z, \hat z)}},   (6)

where (s_1, s_2) are the Lagrange multipliers associated with the individual distortion penalties and p^*(\hat z, \hat x) = \sum_z q^*(\hat z, \hat x|z) p(z) is the Ẑ × X̂-marginal of the output process (ẑ^n, x̂^n). Moreover, the optimal parametric solution of (3) is

R(D^*_s, D^*_o) = s_1 D^*_s + s_2 D^*_o - \sum_z p(z) \log\Big(\sum_{\hat z, \hat x} p^*(\hat z, \hat x)\, e^{s_1 \hat d_s(z, \hat x) + s_2 d_o(z, \hat z)}\Big),   (7)

where

D^*_s = \sum_z p(z) \sum_{\hat z, \hat x} q^*(\hat z, \hat x|z)\, \hat d_s(z, \hat x),   (8)

and

D^*_o = \sum_z p(z) \sum_{\hat z, \hat x} q^*(\hat z, \hat x|z)\, d_o(z, \hat z).   (9)

(ii) If s_1 < 0 and s_2 = 0, then R(D^*_s, D^*_o) = R(D^*_s), where D^*_s is given by (8).
(iii) If s_1 = 0 and s_2 < 0, then R(D^*_s, D^*_o) = R(D^*_o), where D^*_o is given by (9).
(iv) If s_1 = 0 and s_2 = 0, then R(D^*_s, D^*_o) = 0.

Proof: A sketch of the proof is given in Appendix A.
Remark 2. (Generalizations) The derivation of Theorem 1 can be extended into more general system models, which may encapsulate a sequence of remote sources, i.e., (x n 1 , x n 2 , . . . , x n j ) and/or a sequence of the observations, i.e., (z n 1 , z n 2 , . . . , z n i ), i ̸ = j, each with their corresponding individual constraints.
Armed with Theorem 1, we can proceed with constructing a generalization of the Blahut-Arimoto algorithm [27], which can optimally solve the optimization problem in (3) for arbitrary finite alphabet sets and general bounded distortion functions. In this paper we do not pursue that direction. Instead, we first focus on finding general structural properties of the solution when the lower bound in (4) is achievable; then we study some relevant setups as a means to further understand the role of the optimal minimizer in the solution of the problem and the role of the multiple distortions. In other words, we want to gain further insights into the setup in Fig. 1, which in turn will help us better understand the role of multiple distortions in goal-oriented communication settings. We prove the following lemma.

Lemma 2. (Structural properties) The lower bound in (4) is achieved, i.e., R(D^*_s, D^*_o) = \max\{R(D^*_s), R(D^*_o)\}, if the optimal minimizer q^*(\hat z, \hat x|z) of (3) satisfies

(C1) I(z; \hat z, \hat x) = I(z; \hat x),   (C2) I(z; \hat z, \hat x) = I(z; \hat z).

Proof: By the chain rule of mutual information, I(z; \hat z, \hat x) = I(z; \hat x) + I(z; \hat z|\hat x) = I(z; \hat z) + I(z; \hat x|\hat z). Then, it is easy to see from (C1), (C2) that both Markov chains z − x̂ − ẑ and z − ẑ − x̂ should be satisfied concurrently. Since the latter case includes the cases where R(D^*_o) ≶ R(D^*_s), the result follows.
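The generalized Blahut-Arimoto iteration mentioned above can be sketched as follows: for fixed multipliers (s_1, s_2), alternate between updating the test channel via the implicit form (6) and re-estimating the output marginal, exactly as in the classical algorithm. The sketch below is an illustrative assumption of what such a generalization could look like (the function name, inputs, and iteration count are ours, not the paper's):

```python
import numpy as np

def generalized_ba(p_z, dhat_s, d_o, s1, s2, iters=300):
    """Fixed-(s1, s2) alternating updates suggested by the implicit form (6).

    p_z:    marginal of the observation z, shape (Z,)
    dhat_s: transformed semantic distortion dhat_s(z, xhat), shape (Z, X)
    d_o:    observation distortion d_o(z, zhat), shape (Z, Zh)
    s1, s2: Lagrange multipliers (both <= 0)
    Returns the test channel q(zhat, xhat | z), shape (Z, Zh, X).
    """
    Zh = d_o.shape[1]
    X = dhat_s.shape[1]
    # Exponent term e^{s1 dhat_s(z, xhat) + s2 d_o(z, zhat)}, shape (Z, Zh, X).
    w = np.exp(s1 * dhat_s[:, None, :] + s2 * d_o[:, :, None])
    # Start from a uniform output marginal p(zhat, xhat).
    p_out = np.full((Zh, X), 1.0 / (Zh * X))
    for _ in range(iters):
        q = p_out[None, :, :] * w
        q /= q.sum(axis=(1, 2), keepdims=True)   # normalize over (zhat, xhat)
        p_out = np.einsum('z,zyx->yx', p_z, q)   # re-estimate output marginal
    return q
```

Sweeping (s_1, s_2) over the negative quadrant would then trace the rate distortion surface parametrically; each returned q is a valid test channel whose rows are probability distributions over (ẑ, x̂).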
In what follows, we utilize both Theorem 1 and Lemma 2 to study the case of binary alphabets, i.e., X = Z = X̂ = Ẑ = {0, 1}, with Hamming distortions.

Problem 1. (Binary alphabets with Hamming distortions) Suppose that, in the setup of Fig. 1, the remote source x and the noisy channel of z given x are modeled as the general binary source and binary channel pair given in (17), and that both d_s and d_o are Hamming distortions.

The following result is a key contribution of this paper. It reveals the structural result of Lemma 2 and, consequently, the characterization of the best achievable rates in (4).

Theorem 2. (Solution of Problem 1) Consider the setup in Fig. 1 restricted to the given data of Problem 1. Then, the following hold:
(i) the structural properties of the solution in Lemma 2 hold;
(ii) the rate distortion function is

R(D^*_s, D^*_o) = \max\{R(D^*_s), R(D^*_o)\}.   (18)

Proof: See Appendix B.

The general result of Theorem 2 shows that the lower bound in (4) is achievable for this class of input data under probability-of-error distortions. As expected, the result in Theorem 2 is also consistent with a similar result obtained (using a different approach) for scalar-valued i.i.d. Gaussian processes with individual MSE distortion constraints in [13, Corollary 1]. An interesting endeavor is to study whether the result of Theorem 2 can be extended to larger alphabets of equal or different sizes.
In what follows, we analyze the computation of the solution derived in Theorem 2, i.e., (18). We first recall that the problem essentially looks for the maximum between

R(D_o) = \min_{q(\hat z|z):\, E[d_o(z, \hat z)] \le D_o} I(z; \hat z),   (19)

and

R(D_s) = \min_{q(\hat x|z):\, E[\hat d_s(z, \hat x)] \le D_s} I(z; \hat x),   (20)

which correspond to a direct rate distortion problem with an i.i.d. binary source z with probability distribution given by (30), and to an indirect rate distortion problem with an i.i.d. binary remote source x and a noisy observation z, both described by (17), respectively. For the direct rate distortion problem (19) with a binary source, it is relatively easy to see that the closed form solution is a straightforward generalization of the well-known analytical solution for a binary source under Hamming distortion:

R(D_o) = H_b(\bar p) - H_b(D_o),  0 \le D_o \le \min\{\bar p, 1 - \bar p\},   (21)

where p̄ = p(z = 0) is computed in (30) and H_b(·) denotes the binary entropy function. On the other hand, an optimal closed form solution of the binary indirect rate distortion function (20) is not known in general, and only bounds exist in the literature, see, e.g., [29]. Nevertheless, one can always use straightforward generalizations of the classical Blahut-Arimoto iterative schemes to numerically compute the optimal solution.
Example 1. (Equiprobable semantic source and binary symmetric channel) In the particular case where the semantic remote source is i.i.d. Bernoulli(1/2), i.e., p(x = 0) = 1/2, and the binary channel in (17) is symmetric with crossover probability p(z = 0|x = 1) = 1 − β, β ∈ [0, 1/2), one can easily infer via (21) that H_b(p̄) = 1 bit/source sample and

R(D^*_o) = \left[1 - H_b(D_o)\right]^+.   (22)

Moreover, for the same input data, it can be shown, see, e.g., [18, Exercise 3.8], that

R(D^*_s) = \left[1 - H_b\!\left(\frac{D_s - \beta}{1 - 2\beta}\right)\right]^+.   (23)

Substituting (22), (23) in Theorem 2, we obtain

R(D^*_s, D^*_o) = \max\left\{\left[1 - H_b\!\left(\frac{D_s - \beta}{1 - 2\beta}\right)\right]^+, \left[1 - H_b(D_o)\right]^+\right\},   (24)

where [·]^+ = max{0, ·}. A display of the rate distortion surface for β = 0.15 is provided in Fig. 2. Based on (24), we observe an interesting interplay between (β, D_s, D_o) regarding the choice of the maximum achievable rates. In particular, it appears that if D_o > (D_s − β)/(1 − 2β), then the system benefits more by encoding, subject to a Hamming distortion, only the semantic information, and therefore the rate is R(D^*_s); whereas if D_o < (D_s − β)/(1 − 2β), the system benefits more by encoding, subject to its distortion, the observable message of the source, with rate R(D^*_o). Clearly, if D_o = (D_s − β)/(1 − 2β), then encoding either the semantic information or the observations offers no advantage for any value of the active distortion region.
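The closed form above, R = max{[1 − H_b((D_s − β)/(1 − 2β))]^+, [1 − H_b(D_o)]^+}, is straightforward to evaluate numerically; the following sketch computes points of the rate distortion surface with β = 0.15, as in Fig. 2 (helper names and evaluation points are ours):

```python
import numpy as np

def h_b(p):
    """Binary entropy in bits, with h_b(0) = h_b(1) = 0."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return float(-(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p)))

def rate_surface(D_s, D_o, beta=0.15):
    """Evaluate the max of the two clipped individual rates from Example 1."""
    r_s = max(0.0, 1.0 - h_b((D_s - beta) / (1.0 - 2.0 * beta)))  # semantic branch
    r_o = max(0.0, 1.0 - h_b(D_o))                                # observation branch
    return max(r_s, r_o)

# At (D_s, D_o) = (1/2, 1/2) both branches vanish: zero rate suffices.
print(rate_surface(0.5, 0.5))
```

Evaluating along the line D_o = (D_s − β)/(1 − 2β) confirms that the two branches coincide there, which is exactly the regime where neither encoding strategy dominates.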
Next, we study a special case where the problem simplifies because of the use of mixed distortion constraints (i.e., a standard erasure distortion [28, Exercise 10.7] and a Hamming distortion), where X̂ = {0, e, 1}. Based on the given data of Problem 2, we derive the following solution for the characterization of (3).
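For concreteness, the standard erasure distortion on a binary source with reproduction alphabet X̂ = {0, e, 1} assigns zero cost to faithful reproduction, unit cost to an erasure, and infinite cost to an outright error. A minimal sketch of the corresponding distortion matrix (the large finite constant standing in for the infinite penalty is an implementation choice):

```python
import numpy as np

INF = 1e9  # numerical stand-in for the infinite penalty on outright errors

# Erasure distortion d(x, xhat) with x in {0, 1} and xhat in {0, e, 1};
# columns ordered as [0, e, 1].
d_erasure = np.array([[0.0, 1.0, INF],
                      [INF, 1.0, 0.0]])
```

Under this distortion, any finite-distortion reproduction either reproduces the source symbol exactly or declares an erasure, which is what makes the mixed-constraint problem tractable.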

IV. CONCLUSIONS
A variant of a robust description source coding problem with two individual distortion criteria for finite alphabet messages was studied in this paper. First, we derived a general theorem, which is an essential step toward constructing generalizations of the Blahut-Arimoto algorithm for this problem. Second, we proved structural properties that have to be satisfied if the best rates are achievable. Finally, we analyzed two relevant scenarios as a means to demonstrate the structural behavior of the solution and the role of the distortion penalties in the system model. A key takeaway from our results is that the class of distortion functions may heavily affect the system behavior irrespective of the task, and hence it should be chosen appropriately.

APPENDIX A PROOF OF THEOREM 1
We give a sketch of the proof due to space limitations. The fully unconstrained problem of (3), using (5), is

L(q) = I(z; \hat z, \hat x) - s_1\big(E[\hat d_s(z, \hat x)] - D_s\big) - s_2\big(E[d_o(z, \hat z)] - D_o\big) + \sum_z \lambda(z)\Big(\sum_{\hat z, \hat x} q(\hat z, \hat x|z) - 1\Big) - \sum_{z, \hat z, \hat x} \mu(z, \hat z, \hat x)\, q(\hat z, \hat x|z),   (26)

where s_1 ≤ 0, s_2 ≤ 0 are the Lagrangian multipliers associated with the individual distortion constraints E[d̂_s(z, x̂)] ≤ D_s and E[d_o(z, ẑ)] ≤ D_o, respectively, λ(z) ≥ 0 is associated with the equality constraint \sum_{\hat z, \hat x} q(\hat z, \hat x|z) = 1, and μ(z, ẑ, x̂) ≥ 0 is responsible for the inequality constraint q(ẑ, x̂|z) ≥ 0.
Leveraging the fact that \sum_{\hat z, \hat x} q^*(\hat z, \hat x|z) = 1, we average both sides with respect to (ẑ, x̂) ∈ Ẑ × X̂ and solve to obtain λ*(z) > 0 in (29). By substituting (29) in (28), we obtain the implicit expression (6) for s_1 ≤ 0, s_2 ≤ 0. Moreover, substituting (6) in (26), we obtain (7), provided that R(D^*_s, D^*_o) > 0. Clearly, the cases discussed in (i)-(iv) follow as special cases of the previous analysis.
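For intuition, the normalization step can be sketched as follows (this reproduces the standard Blahut-style argument in our notation, not the paper's exact intermediate equations): stationarity determines the minimizer up to a per-z constant, and summing to one fixes that constant.

```latex
\begin{align*}
q(\hat z, \hat x \mid z) &\propto p^*(\hat z, \hat x)\, e^{s_1 \hat d_s(z, \hat x) + s_2 d_o(z, \hat z)},\\
\sum_{\hat z, \hat x} q^*(\hat z, \hat x \mid z) = 1
\;\Longrightarrow\;
q^*(\hat z, \hat x \mid z)
&= \frac{p^*(\hat z, \hat x)\, e^{s_1 \hat d_s(z, \hat x) + s_2 d_o(z, \hat z)}}
        {\sum_{\hat z', \hat x'} p^*(\hat z', \hat x')\, e^{s_1 \hat d_s(z, \hat x') + s_2 d_o(z, \hat z')}}.
\end{align*}
```

The multiplier λ*(z) absorbs exactly the logarithm of the normalizing sum, which is why that sum reappears inside the rate expression (7) after substitution.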

APPENDIX B PROOF OF THEOREM 2
Recall that the input data and the distortion functions are introduced in Problem 1. We first start with some preliminary calculations. In particular, using (17), we obtain p(z), which gives (30). Using the fact that p(z, x) = p(z|x)p(x), we obtain (31). Moreover, from (30), (31) and the fact that \hat d_s(z, \hat x) = \sum_{x \in X} p(x|z)\, d_s(x, \hat x) (from the characterization in Lemma 1), we can obtain \hat d_s(z, \hat x) explicitly. We can now proceed to prove (i).
The explicit structure of (47) reveals that the optimal solution for this problem is parametrized only by the Lagrangian multiplier s_2 < 0, which further means that this solution should be R(D^*_s, D^*_o) = R(D^*_o) (from Theorem 1). The latter implies that the Markov chain z − ẑ − x̂ holds. A way to compute R(D^*_s, D^*_o) is the following. Find s^*_2 < 0 from (8) using the explicit expressions (47) and (48); this yields s^*_2 in closed form. Then, by substituting all the pieces together, we obtain (21).