The Shannon Information Capacity of an Arbitrary Radiating Surface: An Electromagnetic Approach

Utilizing a cross-disciplinary approach, we explore Shannon information-theoretic characterizations of the information capacity limits of generic electromagnetic (EM) surfaces intended for possible use in wireless communication links. Our principal task is to first formulate at a general and rigorous level the EM theory of the information that can be extracted from the Maxwellian fields radiated by an arbitrarily shaped continuous surface. This is then followed by a detailed derivation and illustration of practical physics-informed algorithms for computing approximations of the Shannon capacity of surfaces with any given geometry operating in Gaussian channels. Our formalism can address both near- and far-field information capacity scenarios, with a mathematical treatment that includes a complete characterization of the source-field polarization structure, mutual coupling, and interactions.

Said Mikki

Index Terms: Antenna theory, capacity limits, electromagnetic (EM) theory, information theory.

A · B: vector-vector inner product in $\mathbb{R}^3$ (or dyad-vector product in $\mathbb{R}^3$ when the first factor is a dyad).
A: column array of arbitrary length.

I. INTRODUCTION
THE subject matter of this article belongs to the electromagnetic (EM) theory of information, a topic that, while not entirely new [1], [2], is currently reemerging in multiple fields [3], [4], [5], [6], [7], [8], [9]. This is an interdisciplinary discourse cutting through both EM and information theory, where there has been an interest in synthesizing selected ideas taken from EM theory, EM engineering, information theory, system theory, communication theory, and signal processing, by coherently integrating them into one or a few closely related formalisms in which both information and physical degrees of freedom are subjected to careful evaluation, analysis, and manipulation at the same theoretical and computational levels [10]. In the EM theory of information, there appear to be three major research problems: statistical correlation processes, often manifest at microscopic or local levels [11], [12], [13], [14], [15], [16], [17]; system representations of continuous phenomena, where the objective is to devise exact and rigorous signal processing models of physical fields [18], [19], [20]; and mutual information in wave processes [18], especially characteristics most commonly seen at the global level of point-to-point or end-to-end link capacity [21], for example, using the popular degree-of-freedom approach [5], [22]. Our main focus in this article is on the third problem, i.e., the physics-informed approach to the analysis, understanding, and use of capacity concepts. This subject has numerous applications. For example, a physics-based take on information capacity may open the door for new ideas in conventional Shannon theory by introducing applications outside coding and error-correction methodologies [18].
Also, an EM understanding of capacity can lead to a better understanding (and hence design) of the overall communication link by injecting EM knowledge into the signal processing part [23]. Perhaps the most obvious example of such potential is the recent interest in capacity-driven optimization and design of various systems [24], [25], [26], [27], [28].
To present a concrete contribution, we further restrict our attention to a specific type of information-transmitting system, EM surfaces, but we try to maintain as much generality in the definition of such structures as possible. The idea of a radiating or transmitting surface is fundamental in different fields. Indeed, the concept encompasses a diverse range of problems and applications, for example: 1) In antenna and scattering theories, the surface equivalence theorem shows that any radiating structure whatsoever can be modeled as a system of surface current distributions [29], [30]. 2) Intelligent transmitting surfaces are deployed as large and complex EM surface structures mounted in dense and dynamic wireless communication environments, where they are intended to control and improve the network performance by injecting new signals into the channel for use in information transmission, estimation, or testing. 3) Reconfigurable reflecting EM surfaces, often operated in scattering modes, are deployed in order to control and modify the propagation characteristics of complex and unpredictable media, especially as envisioned for the forthcoming 6G technology [31], [32], [33], [34].
In such applications (and others not discussed here), a generic transmitting surface $S_t$ may be viewed as an artificial active or passive "EM source" assigned the task of contributing information to the communication channel, whether in the form of an individual concrete antenna element, a continuous distribution of transmitting sources, or a reconfigurable reflector. The unifying concept here is that the surface is a continuous EM source conformal to a geometrically shaped two-manifold, where both EM and geometrical degrees of freedom are inextricably coupled together. On the other hand, in information theory, capacity and entropy concepts have been mostly developed for point-to-point communication schemes where the dominant mathematical model is the random variable and the random process [35]. Unfortunately, a continuous EM source requires a random field theory [36] for its proper mathematical treatment [37]. To the best of our knowledge, a complete and fully rigorous stochastic calculus theory of the Maxwellian field suitable for the purpose of communication system analysis has not been constructed yet (see Footnote 1). To evade this shortcoming, in this article, we propose a computational approach that can be used to approximate fundamental capacity limits of arbitrarily shaped EM surfaces under quite generic statistical scenarios.
Footnote 1: However, there have been several attempts to study various aspects of the EM problem from the statistical viewpoint, e.g., random media, propagation modes in fluctuating domains, and channel models in wireless networks [38], [39], [40]. Nevertheless, we do not consider most of these theories proper stochastic calculus theories. In the latter case, the very differential and integral operators themselves must be replaced by stochastic generalizations [41], where it turns out that the rules of ordinary calculus in Maxwell's theory may not apply [42].
There are three major theoretical considerations that must be taken into account while attempting to build a satisfactory EM theory of information for radiating surfaces. We mention here those that the author believes are currently the most urgent.
1) The information capacity of a surface would, in general, depend on how the latter's radiated fields are measured. The measurement apparatus is formally analogous to a receiver system in wireless communication systems.
2) The information capacity of a continuous source supported by a surface $S_t$ depends on the purely geometrical features of the surface as encoded by its Riemannian structure, i.e., the local metric relations and how they vary from one point on the surface to another. In addition, the attained capacity depends on other physical parameters such as the electrical area (size) of the surface (physical area normalized with respect to the wavelength).
3) The information capacity depends on the type of the radiated field, i.e., whether near or far fields, the distance of the observation system to the source, and the radiation fields' rich polarization and wavelength substructures.
We believe that these problems and the related considerations have not received sufficient attention in the sprawling literature on EM capacity. In fact, capacity studies tend to be approached from the perspective of rather specific examples and systems, hence sometimes avoiding the most general formulation possible for the sake of concreteness. In our approach, we intentionally try to keep the discussion as general and fundamental as possible while carefully addressing each of the abovementioned physics-based contributions to the purely information-theoretic definition of capacity. Our strategy is based on identifying and isolating a proper physical structure (whether geometrical or EM) that dominates a corresponding information-theoretic aspect of the overall system. However, we use a computational approach where eventually a concrete (still general) algorithm for computing capacity electromagnetically is devised and illustrated with several examples. The proposed method is suitable for integration with standard full-wave EM solvers, especially the method of moments (MoM) [43].
This article is organized as follows. In Section II, we provide a broad outline of the rather general (and very complex) problem of describing the information-theoretic setting of the EM communication system in space-time. The purpose is to motivate the need for our alternative (simpler) strategy, which is to be developed in the following. We begin formulating the latter (computational) approach in Section III, where the key necessary mathematical ideas of the frequency-domain formalism, based on the deployment of a hierarchy of finite point dipole model approximations, are outlined. Formulas for the information capacity are then derived in Section IV for the ubiquitous additive white Gaussian noise (AWGN) model. Extensive numerical examples and analysis are then provided in Section V, followed by the conclusion. A series of supporting appendixes is also inserted at the end to complete the presentation found in the main part of this article.

II. FUNDAMENTAL CONSIDERATIONS
Imagine a generic EM transmitting system represented by a radiating surface $S_t$ endowed with the structure of a differentiable (smooth) manifold. Without loss of generality, one may take this two-surface to be a perfect electric conductor (PEC), but the following formulation can be extended to more generic EM boundary conditions. We assume that an information source system is modeled as $S = \{s_i, i \in I\}$, where $s_i$ is a random variable whose probability distribution is $p_i(s)$ and $I$ is an index set (which could be finite, countable, or uncountable). Each random variable $s_i$ might be discrete or continuous, though we focus on the continuous case in the following. One way of thinking about the signal $s_i$ is that it represents the information symbols injected into a "point port" of the continuous surface $S_t$ (see the following). The main objective of this article is to investigate the information capacity of the radiating system $S_t$ when connected to (excited by) the information source $S$.
In the standard setting of classical EM systems, i.e., the regime where the macroscopic Maxwell's equations hold [44], information may be injected into the device only through specialized structures called ports [29], [45]. These are waveguide structures capable of confining EM fields across the transverse plane, while energy flows along the waveguide axis [46]. For simplicity, we assume that at each port, only one waveguide field mode (the dominant mode) is excited, which is denoted by $E_p^j(x, t)$, where $x \in \mathbb{R}^3$ is the position, $t \in \mathbb{R}$ is the time, $j \in \{1, \ldots, N_p\}$ is the port index, and $N_p$ is the number of point ports. The port excitation field is further assumed to be separable into two main factors: a pure time signal controlled by the information source, and the spatial profile of the waveguide field mode. The resulting separable expression for the EM excitation at the $j$th point port is the general mathematical model of the information data stream injected into that port. Here, $s_{i_{n,j}}$ is the $n$th time slot symbol, which is the outcome of the stochastic experiment of drawing from the random variable $s_{i_n}$ when applied to the $j$th point port. The time pulse $u(t)$ carries information with symbol data rate $f_s = 1/T_s$. In such a model, the map $i_{n,j} : \mathbb{Z} \times \mathbb{N} \to I$ is deployed in order to perform signal multiplexing, so different information symbols $s_i \in S$ could be transmitted using the same point port if desired. Let $F_t(x, x'; t, t')$ be the time-dependent current Green's function (2-D tensor of rank 2) of the transmitting surface $S_t$ [7].
The induced surface current due to the $j$th point port is given by (2). Since the operator relation linking the excitation field and the induced current is linear [47], [48], the total current due to all point ports exciting the antenna simultaneously is the direct sum of all currents of the form (2) [49]. The total transmitting current (5) will now radiate into the surrounding space, giving rise to EM fields $E(x, t)$ and $H(x, t)$.
For simplicity, we focus only on the electric field, which can be expressed in terms of the surface current distribution through a Green's function formula [50], [51]. Here, $G_0$ is the forward electric-field free-space dyadic Green's function (3-D tensor of rank 2), i.e., Green's function of the domain surrounding the antenna where no random scattering objects are assumed to exist [52]. Next, we introduce the surface $S_o$, which is the two-manifold where observations of the antenna's radiated field will be collected. The received or observed field is the collection $O$ of the values of $E(x, t)$ at the points $x \in S_o$. Clearly, this is an infinite set, so the problem of estimating a field-theoretic information capacity starting from EM data cannot be dealt with directly using the standard Shannon theory since the latter works best with a finite number of random variables [35], [53].
In order to attain a better conceptual grasp of the general structure of the EM theory of information capacity, we try to capture the essentials of the process of information transmission by a continuous surface $S_t$ through the transformation $S \to O$ in (7). We would like to analyze the information-theoretic structure of this transformation. One way to do so is by comparing the Shannon information contents of the two systems comprising the information source $S$ and the observation or receiver apparatus $O$. The most obvious way to do that is through the concept of mutual information between the two random variables [35]. This naturally leads to the relative capacity concept to be defined in detail in Section IV. However, the problem as formulated above is too complex to deal with in such a very general form. In Section III, we propose a relatively simpler model where some of the inessential constraints of the general transformation (7) will be relaxed in order to reduce the mathematical complexity of the analysis.

III. DISCRETE MEMORYLESS FREQUENCY-DOMAIN EM MODEL AND THE APPROXIMATION HIERARCHY
Historically speaking, several methods have been proposed to deal with the information content of continuous classical, quantum, and random fields. Most often, one may first attempt locating a suitable effective Hilbert space representation of the original spatiotemporal classical radiation problem, so the number of degrees of freedom becomes countable, followed by a suitable pragmatic criterion to be employed for truncating the originally infinite dimension to a finite number [18]. Another approach, which we follow here, relies on already starting the description of the problem by reducing the radiating currents and fields themselves to a finite number of elementary excitations or modes (oscillators), which is the method often used in quantum information theory for instance [54], [55].
In the point source approximation hierarchy approach, the number of "point ports" is increased such that, basically, the entire surface $S_t$ is "covered" by excitations. Therefore, our key idea is to replace the infinite number of field-current oscillators in the system $S \to O$ by a finite number of oscillators on both sides, namely, $N_t$ at $S_t$ and $N_o$ at $S_o$, effectively replacing (7) by its discrete counterpart. In order to perform the calculations in a practical manner, we further restrict ourselves to narrowband signals, so our capacity results will be frequency dependent. To do so, we first need to impose two fundamental assumptions.
1) The transmitting surface port-to-current system is memoryless-in-time, i.e., shift-invariant in time: the current Green's function $F_t(x, x'; t, t')$ depends on $t$ and $t'$ only through the difference $t - t'$.
2) The surrounding medium is memoryless-in-time, i.e., shift-invariant in the same sense: $G_0$ depends on $t$ and $t'$ only through $t - t'$.
These two assumptions jointly imply that the communication channel from the source $S$ to the field $E(x, t)$ (or the observation set $O$) is time-invariant. This allows us to use the Fourier transform method and hence work in the frequency domain [49], [56].
Remark 1: It should be noted that in general, the memoryless-in-time current Green's function is not memoryless-in-space. Indeed, except for quite few and rather uninteresting cases where $S_t$ obeys global spatial translation symmetry, one cannot in general write the current Green's function as a function of the difference $x - x'$ alone [19], [49]. Therefore, in terms of the current excitation problem, most radiating structures, e.g., all antennas, do have spatial memory [19], [20]. This is one of the main reasons why a fully fledged spatiotemporal EM theory of information and signal processing is highly nontrivial [7], [10].
In the frequency domain, all dynamic quantities vary sinusoidally in time as per $\exp(-i\omega t)$, while all fields and currents become complex [29]. Fix a spatiotemporal Fourier mode $\exp(i\mathbf{k} \cdot x - i\omega t)$, where $\mathbf{k}$ is the wavevector. Then, the dispersion relation of the vacuum is $|\mathbf{k}| = k$, where $k = \omega/c$ and $c$ is the speed of light [57]. The frequency-domain dyadic Green's function $G_0$ is given in [52], where the 3-D tensor $I$ appearing in it is the unit dyad. The radiated field can be expressed in terms of this Green's function using the formula (12) [51], where in what follows we are interested in the region exterior to $S_t$.
Our objective is to first construct an algorithm for estimating the capacity of a transmitting surface $S_t$ with respect to an observation surface $S_o$ whose local unit normal vector is $\hat{N}_o(x)$, $x \in S_o$. There are $N_t$ point sources applied at the transmitting side, where the $i$th point source is positioned at $x_i$ and equipped with two polarization degrees of freedom, described by the unit vectors $\hat{\alpha}_i^s$, $s = 1, 2$, in (13). These vectors are mutually orthonormal and tangential to $S_t$ at $x_i$; in (13), $\hat{N}_t(x)$ is the local normal to the surface $S_t$ and $\delta_{ss'}$ is the Kronecker delta function. A generic $N_t$-discretization of the continuous surface $S_t$ is then formally characterized by the source positions $x_i$, $i = 1, \ldots, N_t$, together with the vectors $\hat{\alpha}_i^s$, $s = 1, 2$, shown in (13). Locally, the current distribution $J(x, \omega)$ may be expanded along $\hat{\alpha}_i^1$ and $\hat{\alpha}_i^2$ as in (15) [49], [58]. Therefore, for a discrete $N_t$-term point approximation of a generic continuous transmitting current $J(x)$, we may write the expansion (16), where $\delta$ is the Dirac delta function (cf. Remark 2). The relation (16) serves as a local expansion of the $i$th current source on the transmitting surface. Note that $J_i^s(\omega) \in \mathbb{C}$ is a frequency-dependent complex number, while the polarization vectors $\hat{\alpha}_i^s$, $s = 1, 2$, are frequency-independent [19].
Remark 2: The Dirac delta function in (16) is a surface delta function [59]. The physical dimensions of the current source amplitudes $J_i^s$ are A · m. Each polarization unit vector $\hat{\alpha}_i^s$, $i = 1, \ldots, N_t$ and $s = 1, 2$, is a function of the position $x_i$. (For simplicity, this is indicated by the use of only the index $i$.) Such position dependence is essential for the general case since closed curved surfaces in $\mathbb{R}^3$ cannot always be modeled by a single coordinate chart, and hence, the use of local coordinate systems, here exemplified by the expansion (15) of the current into local components along $\hat{\alpha}_i^1$ and $\hat{\alpha}_i^2$, becomes mandatory for a correct treatment of the general case (for more details, see [49], [60]).
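The $N_t$-discretization described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for the example and is not taken from the article: the surface is a sphere, the points are placed with a Fibonacci lattice, the function name `sphere_discretization` is hypothetical, and the two tangential polarization vectors are chosen as the spherical unit vectors $\hat{\theta}$ and $\hat{\phi}$.

```python
import numpy as np

def sphere_discretization(n_t, radius=1.0):
    """Sample n_t point-source positions on a sphere (assumed geometry) and
    return positions, unit normals, and two tangential unit vectors per point."""
    i = np.arange(n_t) + 0.5
    theta = np.arccos(1.0 - 2.0 * i / n_t)      # polar angle, strictly inside (0, pi)
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i        # golden-angle azimuth (Fibonacci lattice)
    st, ct = np.sin(theta), np.cos(theta)
    sp, cp = np.sin(phi), np.cos(phi)
    normal = np.stack([st * cp, st * sp, ct], axis=1)       # local normal (radial)
    alpha1 = np.stack([ct * cp, ct * sp, -st], axis=1)      # theta-hat, tangential
    alpha2 = np.stack([-sp, cp, np.zeros(n_t)], axis=1)     # phi-hat, tangential
    return radius * normal, normal, alpha1, alpha2

positions, normals, alpha1, alpha2 = sphere_discretization(50, radius=2.0)
```

Each point carries its own local frame, which mirrors the remark that a closed curved surface generally cannot be covered by a single coordinate chart.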
Substituting (16) into (12) yields the discrete field expansion (17). Remark 3: Expression (17) describes the EM field outside $S_t$ and is the key EM model that will be used in this article. It is sometimes referred to in the literature as the infinitesimal dipole model (IDM) and has been successfully utilized in various applications, e.g., see [61], [62], [63], [64], [65], [66]. The relation (17) is valid in both the near- and far-field zones [49], [61], [62] and has been verified extensively both computationally and experimentally [65], [67], [68]. A discrete IDM is known to be singular at the surface $S_t$ itself because Green's functions are themselves singular at $x = x'$ [52]. However, in practical computational applications pertinent to communications and power transfer, one most often works under the condition $x \neq x'$ (exterior domain scenario), so the radiated fields as such are never singular [49], [61]. To improve the accuracy of (17) for near-field (NF) predictions, one often needs to increase $N_t$ [66] or use global optimization [62], [65], [69], [70], [71].
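A minimal Python sketch of an infinitesimal dipole model is given below. It uses the standard closed form of the free-space dyadic Green's function for the $\exp(-i\omega t)$ convention, $G_0 = (I + \nabla\nabla/k^2)\,e^{ikr}/(4\pi r)$, and the common prefactor $i\omega\mu_0$ for the field of a current moment. These normalizations, and the names `dyadic_green` and `idm_field`, are assumptions of the sketch and may differ from the exact form of (12) and (17).

```python
import numpy as np

MU0 = 4e-7 * np.pi        # vacuum permeability (H/m)
C0 = 299792458.0          # speed of light (m/s)

def dyadic_green(x, xp, k):
    """Free-space dyadic Green's function (3x3), exp(-i*omega*t) convention."""
    r_vec = np.asarray(x, float) - np.asarray(xp, float)
    r = np.linalg.norm(r_vec)
    if r == 0.0:
        raise ValueError("Green's function is singular at x = x'")
    rhat = r_vec / r
    kr = k * r
    g = np.exp(1j * kr) / (4.0 * np.pi * r)       # scalar Green's function
    a = 1.0 + 1j / kr - 1.0 / kr**2               # coefficient of the unit dyad
    b = -1.0 - 3j / kr + 3.0 / kr**2              # coefficient of rhat rhat
    return g * (a * np.eye(3) + b * np.outer(rhat, rhat))

def idm_field(x, positions, alphas, amplitudes, omega):
    """Electric field at x radiated by point dipoles (assumed normalization).

    positions: (N_t, 3); alphas: (N_t, 2, 3) tangential unit vectors;
    amplitudes: (N_t, 2) complex current moments in A*m."""
    positions = np.asarray(positions, float)
    alphas = np.asarray(alphas, float)
    amplitudes = np.asarray(amplitudes, complex)
    k = omega / C0
    e = np.zeros(3, complex)
    for x_i, alpha_i, j_i in zip(positions, alphas, amplitudes):
        moment = j_i @ alpha_i                    # sum over the two polarizations
        e += 1j * omega * MU0 * (dyadic_green(x, x_i, k) @ moment)
    return e
```

The same function covers near and far zones because the full dyad, including the $1/(kr)$ and $1/(kr)^2$ terms, is retained.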
The radiated field is observed at the surface $S_o$, which may or may not fully enclose the radiating surface $S_t$. If $N_o$ observation points are used to gather field measurements obtained via idealized localized field probes, then the observation system may be described mathematically as a set of observation points $x_j$, each equipped with a triplet of measurement directions. Here, at each $j = 1, \ldots, N_o$, three perpendicular measurements of the field may be enacted along the three Cartesian directions $\hat{\beta}_j^r$, $r = 1, 2, 3$ (see Footnote 4).
Remark 4: In general, there is no need to consider observation volumes since it is known from EM theory that the field on a closed surface fully characterizes EM radiation everywhere in space [29]. When the observation surface is not closed, it is possible to use EM machine learning to find an IDM like (16) that can predict the fields everywhere in a suitable domain exterior to the transmitting surface $S_t$ [61], [71], [72].
The signal measured at the $j$th observation point $x_j$ along the $r$th direction at frequency $\omega$ is denoted by $O_j^r(\omega) \in \mathbb{C}$. For ideal perfectly localized point probes, this can be given by (19), where $b_j^r(\omega) \in \mathbb{C}$ is the observation device's adjustable gain or responsivity function. Substituting (17) into (19), we finally arrive at (20) for $j = 1, \ldots, N_o$ and $r = 1, 2, 3$. The expression (20) fully characterizes the observation made at position $x_j$ along the direction labeled by $r$ when the measurement is conducted over a discretized source $S_t$ using the apparatus modeled as $S_o$ at frequency $\omega$. One of the merits of (20) is that it allows the directions of observation measurements, labeled by $r$ and $j$, to vary locally from position $x_j$ to position $x_{j'}$, possibly with different measurement gains (responsivities) $b_j^r(\omega)$. In this way, the ability to extract information from the transmitting surface's radiated fields is significantly enhanced.
Footnote 4: In other words, each axis triplet $\hat{\beta}_j^r$ can be obtained by a local rotation of the global coordinate system $\hat{x}_r$, $r = 1, 2, 3$, around the observation point $x_j$.
Next, and motivated by the general formula (20), let us define the matrix element $H_{ij}^{rs} \in \mathbb{C}$ as in (21). This is an array of $6N_tN_o$ complex numbers that fully characterizes the EM coupling between the discrete system $S_t(N_t)$ and the discrete measurement apparatus $S_o(N_o)$. Due to the complexity of formula (21), we may simplify the theory by first isolating those key building blocks giving rise to natural substructures embedded into $H_{ij}^{rs}$. We start by looking into how the polarization structure of EM radiation determines the character of the coupling interaction between a radiating source centered at $x_i$ and an observation made at $x_j$. To achieve this, we construct the matrix relation (22), which characterizes the complete EM channel from the $i$th source to the $j$th observation point. The source vector $\mathbf{J}_i$ possesses two degrees of freedom (surface current source), while the observation array $\mathbf{O}_j$ is constructed from the radiated field's 3-D data and hence enjoys three degrees of freedom, as illustrated by the array structures in (24). It should be noted that three degrees of freedom are needed in the NF zone in order to describe the polarization of the EM NF, but this number drops to two in the far zone [73], [74]. This remains true even when the radiating current is always taken as a surface (hence 2-D) distribution as per (15) [75]. Therefore, the above multidimensional structure is general enough to deal with both near- and far-field information capacity scenarios. In order to compute the capacity of the entire $S_t \to S_o$ configuration, it is required to assemble all point-to-point interactions of the form (22). First, we build the global input and output arrays $\mathbf{J}$ and $\mathbf{O}$ in (25) by concatenating the elementary forms given in (24).
The complete input-to-output relation of the transmitting surface measurement process can be put into the matrix form (26), where the $S_t \to S_o$ channel matrix $\mathbf{H}$ is identified in (27). It will be seen in Section IV that the matrix $\mathbf{H}$ in (27) contains an adequate amount of the critical EM data needed for the determination of the information capacity of the generic transmitting surface $S_t$ when the standard random channel model of AWGN is assumed. The key to this result is to note that (26) possesses the same mathematical structure as a typical multiple-input multiple-output (MIMO) wireless communication system [40], [76], [77]. The details of such analysis will be presented next.
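The assembly of the global channel matrix from $3 \times 2$ source-to-observer blocks can be sketched in Python as follows. The block entry is taken here as $b_j^r\,\hat{\beta}_j^r \cdot G_0(x_j, x_i) \cdot \hat{\alpha}_i^s$, with constant physical prefactors omitted; this normalization, the row and column ordering, and the function names are assumptions of the sketch rather than a reproduction of (21) and (27).

```python
import numpy as np

def dyadic_green(x, xp, k):
    """Free-space dyadic Green's function (3x3), exp(-i*omega*t) convention."""
    r_vec = np.asarray(x, float) - np.asarray(xp, float)
    r = np.linalg.norm(r_vec)
    rhat = r_vec / r
    kr = k * r
    g = np.exp(1j * kr) / (4.0 * np.pi * r)
    a = 1.0 + 1j / kr - 1.0 / kr**2
    b = -1.0 - 3j / kr + 3.0 / kr**2
    return g * (a * np.eye(3) + b * np.outer(rhat, rhat))

def channel_matrix(obs_pts, betas, gains, src_pts, alphas, k):
    """Assemble a (3*N_o) x (2*N_t) coupling matrix from 3x2 blocks.

    obs_pts: (N_o, 3); betas: (N_o, 3, 3), rows are measurement directions;
    gains: (N_o, 3) probe gains; src_pts: (N_t, 3);
    alphas: (N_t, 2, 3), rows are tangential polarization vectors."""
    obs_pts, src_pts = np.asarray(obs_pts, float), np.asarray(src_pts, float)
    betas, alphas = np.asarray(betas, float), np.asarray(alphas, float)
    gains = np.asarray(gains, complex)
    n_o, n_t = len(obs_pts), len(src_pts)
    H = np.zeros((3 * n_o, 2 * n_t), complex)
    for j in range(n_o):
        for i in range(n_t):
            G = dyadic_green(obs_pts[j], src_pts[i], k)
            # entry [r, s] of the block is beta_j^r . G . alpha_i^s, scaled by b_j^r
            block = gains[j][:, None] * (betas[j] @ G @ alphas[i].T)
            H[3 * j:3 * j + 3, 2 * i:2 * i + 2] = block
    return H
```

The resulting array has $3N_o \times 2N_t = 6N_tN_o$ entries, matching the count quoted in the text.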

A. Information-Theoretic Background
It can be seen that in our model, information is injected into the generic continuous transmitting surface $S_t$ at $N_t$ points $x_i$ through the transmitting surface current distribution value $J(x_i)$, where each point is associated with two possible independent degrees of freedom labeled by $s = 1, 2$, giving rise to the array $\mathbf{J}_i$ [see (24)]. The latter can be a discrete, continuous, or mixed random vector. For definiteness and to simplify the presentation, we only consider continuous random variables in this article, but the discrete and mixed cases are similar. Overall, this implies that the information content of $\mathbf{J}(\omega)$, see (25), which is the differential entropy for the continuous random variable case, can be expressed through the formula (28) [53], where $N$ and $p(J_1, \ldots, J_N)$ are the length and pdf of $\mathbf{J}$, respectively, while log is the binary logarithm. Standard quantities in information theory, such as the joint entropy $H(X, Y)$, the conditional entropy $H(X|Y)$, and the mutual information $H(X : Y)$, may all be defined in a manner analogous to (28) for generic complex random vectors $X$ and $Y$ [40], [53].
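For reference, the standard textbook forms of the quantities named above can be written as follows. This is a sketch in generic notation and is not claimed to reproduce (28) symbol for symbol; for complex entries, each $dJ_n$ is understood as an integration over the real and imaginary parts of $J_n$.

```latex
% Differential entropy of the input array (binary logarithm, units of bits)
H(\mathbf{J}) = -\int p(J_1, \ldots, J_N)\, \log_2 p(J_1, \ldots, J_N)\; dJ_1 \cdots dJ_N

% Mutual information expressed through joint and conditional entropies
H(X : Y) = H(X) + H(Y) - H(X, Y) = H(X) - H(X \mid Y)
```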
Let the mean value of $\mathbf{J}$ be $m_{\mathbf{J}} := E\{\mathbf{J}\}$, where $E$ is the expected value operator. The covariance matrix of the input information vector $\mathbf{J}$ is the $N \times N$ matrix $E\{(\mathbf{J} - m_{\mathbf{J}})(\mathbf{J} - m_{\mathbf{J}})^{\dagger}\}$, where $\dagger$ is the Hermitian operation (complex conjugation and transpose operations). Note that here, $N = 2N_t$. For the noise part, we consider a complex normal noise vector (circularly symmetric Gaussian) consisting of an $N \times 1$ array $n(t) := [n_1(t) \ldots n_N(t)]^T$, now with $N = 3N_o$, where each element's sample is a zero-mean complex normal random variable, i.e., $n_j(t) \sim \mathcal{CN}(0, \sigma_{n_j})$, $j = 1, \ldots, 3N_o$, and all these components are statistically independent (see [40], [78]). The noise covariance matrix is defined analogously, where $m_n$ is the noise mean vector and is assumed zero. The AWGN model can be summarized by the relation (31), $\mathbf{O} = \mathbf{H}\mathbf{J} + n$, where $n$ is a sample of the noise vector process introduced above, while $\mathbf{O}$, $\mathbf{H}$, and $\mathbf{J}$ are given in (26). In this model, the noise is also assumed to be independent of the information signal $\mathbf{J}$. We work in a frequency-domain (passband) regime where the carrier (center) frequency is $\omega$, so all these arrays are generally complex. The bandwidth $B := \Delta\omega/2\pi$ around the frequency $\omega$ is assumed to be small enough such that the flat-channel approximation can be applied to the noisy MIMO system model (31). Note that there is no loss of generality here since, for wideband channels, one can still use the same model (31) by deploying a suitable orthogonal frequency-division multiplexing (OFDM) technique [40], [76], [77]. In particular, it was shown recently that specialized EM OFDM schemes can be devised to decouple wideband antenna models where each frequency becomes essentially independent of others [23], [79].
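The AWGN model (31) can be simulated with a short Python sketch. The function names are hypothetical, and the noise parameters are interpreted here as per-component variances $E\{|n_j|^2\}$, which is an assumption about the meaning of $\sigma_{n_j}$.

```python
import numpy as np

def complex_normal(rng, variances, n_samples):
    """Zero-mean circularly symmetric complex Gaussian samples.
    Row j has variance variances[j]; rows and columns are independent."""
    std = np.sqrt(np.asarray(variances, float) / 2.0)[:, None]
    shape = (len(variances), n_samples)
    return std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def awgn_channel(H, J, noise_var, rng):
    """Apply O = H J + n to each column of J (one column per channel use)."""
    n = complex_normal(rng, noise_var, J.shape[1])
    return H @ J + n

def sample_covariance(X):
    """Estimate E{(X - m)(X - m)^dagger} from samples stored as columns."""
    Xc = X - X.mean(axis=1, keepdims=True)
    return Xc @ Xc.conj().T / X.shape[1]
```

With unit-variance uncorrelated inputs, the sample covariance of the output should approach $\mathbf{H}\mathbf{H}^{\dagger}$ plus the diagonal noise covariance as the number of samples grows.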

B. Concept of Relative Capacity of a Surface
Throughout the rest of this article, we will be interested in the channel capacity associated with the information transfer process S t → S o , which we define in a way similar to the capacity concept based on reliable communications [35].
Definition 1 (Relative Capacity): If the prior probability of sending a given vector sample of symbols $\mathbf{J}$ into the $N_t$ radiating point ports is denoted by $p(\mathbf{J})$, then we define the information capacity of $S_t$ relative to $S_o$ by the formula (32), i.e., as the maximum of the mutual information between $\mathbf{J}$ and $\mathbf{O}$, where the maximization is performed over all possible vector symbol probability distributions $p(\mathbf{J})$.

Remark 5: The rigorous and exact definition of relative capacity is given in Definition 1. It is completely determined by a series of well-defined mathematical objects carefully constructed throughout the previous passages of this article, which include the radiating surface system $S_t$ and the measurement apparatus $S_o$. The relative capacity measures the information capacity of $S_t$ relative to $S_o$ in the sense that this capacity depends on the geometry of $S_t$ and not on the signal processing details of how the surface was excited. This geometry enters the picture through the channel or coupling matrix $\mathbf{H}$ in (21), specifically through the local tangential vectors $\hat{\alpha}_i^s$. These local vectors are equivalent to the local Riemannian metric tensor. On the other hand, the relative capacity does not depend on how the input signals are coded, nor on what information is sent. The relative capacity is thus a property of the surface $S_t$ itself, its radiated fields, and the choice of $S_o$.
Recall that mutual information can be expressed as $I(\mathbf{J} : \mathbf{O}) = H(\mathbf{J}) - H(\mathbf{J}|\mathbf{O})$. The origin of information loss is the presence of the noise process $n$. Without noise, the capacity is equal to the maximum possible information content of the transmitting surface, which is $H(\mathbf{J})$. Therefore, knowledge of the channel matrix $\mathbf{H}$ allows the determination of the mutual information and hence capacity. Since the channel matrix (27) is completely determined by the geometry of the transmitting surface and the medium's Green's function, we can then use EM theory to compute approximations of the information capacity of the process $S_t \to S_o$, as will be shown next. For an AWGN MIMO model such as (31), it has been shown that the capacity per sample is given by the log-determinant formula (33) [78]. Here, det is the matrix determinant operation. The array $1_N$ is a unit matrix of dimension $N$, where in the case of (33), we have $N = 3N_o$.
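A minimal Python sketch of a log-determinant evaluation is shown below. It computes the standard Gaussian MIMO expression $\log_2 \det(1 + R_n^{-1}\mathbf{H} R_J \mathbf{H}^{\dagger})$ for given input and noise covariance matrices; the symbols $R_J$ and $R_n$ and the function name are assumptions introduced for the illustration. This is the mutual information attained by a circularly symmetric Gaussian input with covariance $R_J$; the maximization over admissible inputs in Definition 1 (for example, under a power constraint) is not performed here.

```python
import numpy as np

def gaussian_mutual_information(H, R_J, R_n):
    """Bits per channel use for O = H J + n with Gaussian J and n.

    H: (n_out, n_in) channel matrix; R_J: (n_in, n_in) input covariance;
    R_n: (n_out, n_out) positive-definite noise covariance."""
    n_out = H.shape[0]
    # Solve R_n X = H R_J H^dagger instead of forming the inverse explicitly
    M = np.eye(n_out) + np.linalg.solve(R_n, H @ R_J @ H.conj().T)
    sign, logabsdet = np.linalg.slogdet(M)     # natural-log determinant, numerically stable
    return logabsdet / np.log(2.0)
```

Using `slogdet` rather than `det` avoids overflow when the matrix dimension $3N_o$ becomes large.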
If the bandwidth of the equivalent baseband channel is $B$, then the corresponding band-limited signal can be completely recovered from samples taken at the Nyquist interval $1/(2B)$, i.e., at a sampling rate of $2B$ [80]. Since we assume that the noise PSD is flat over the band, i.e., $S_n(f') = \mathcal{N}/2$ for $f' \in [f - B, f + B]$, where $S_n$ is the PSD of the noise signal $n(t)$ around the passband (center) frequency $f = \omega/2\pi$, the Nyquist samples are uncorrelated [81], which, for Gaussian random variables, implies stochastic independence (see Footnote 7). Therefore, the capacity of the data stream is simply the capacity per sample multiplied by the data rate, leading to the bit/s capacity formula (34). Expression (34) provides the most general form of the information capacity in a Gaussian channel setting.
In the following, we examine two possible fundamental scenarios: the first corresponds to the situation when there is no mutual coupling (MC; EM interaction) between the various point sources; the second is the complementary case in which such interactions are present in the transmitting system.

C. No-MC Capacity Formulas
Here, it is assumed that both the signal $J$ and the noise $n$, when modeled as random vectors, are self-uncorrelated, i.e., their respective covariance matrices take the diagonal form (35). This is the most basic scenario in information theory, often referred to in the literature as transmission with no knowledge of the channel state information [76] (see footnote 8). Substituting (35) into (34) and using a matrix identity (see footnote 9), we arrive at (36). The relation (36) is the main capacity formula of the information transmission system $S_t \to S_o$ when the radiating surface does not possess knowledge of the communication channel. We will, however, work with the slightly modified version (37). The derivation of (37) and the definitions of the normalized channel matrix and the signal-to-noise ratio $\rho$ can be found in Appendix B. The relation (37) is more convenient for numerical capacity calculations, as will be illustrated with several examples in Section V.

Footnote 7: The PSD level $N$ itself may depend on $f$, i.e., the carrier frequency, but is still required to be constant over the small passband span $2B$ [76], [77].

Footnote 8: Since the noise is by assumption a white random process, an explicit dependence of the induced narrowband noise on frequency is not needed here [40], [81], but our model allows for possible variation in $\sigma_J(\omega)$ with respect to the center frequency $\omega$.

Footnote 9: Namely, the identity $\det(1_n + A \cdot B) = \det(1_m + B \cdot A)$ for $n \times m$ and $m \times n$ matrices $A$ and $B$, respectively.
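The determinant identity of footnote 9 can be checked numerically, and a no-MC spectral efficiency can be sketched in the same spirit. The code below is a minimal illustration under stated assumptions: the Frobenius-norm normalization and the $\rho$ scaling in `no_mc_spectral_efficiency` follow a common MIMO convention and are not taken from Appendix B, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 12, 6  # illustrative sizes, e.g., n = 3*N_o and m = 2*N_t
H = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
snr = 10.0    # hypothetical ratio of source variance to noise variance

# Identity det(1_n + A B) = det(1_m + B A) with A = snr*H and B = H^H:
# the n x n determinant equals the (smaller) m x m determinant.
logdet_big = np.linalg.slogdet(np.eye(n) + snr * H @ H.conj().T)[1]
logdet_small = np.linalg.slogdet(np.eye(m) + snr * H.conj().T @ H)[1]

def no_mc_spectral_efficiency(H, rho):
    """log2 det(1 + (rho / n_cols) * Hn^H Hn) in bit/s/Hz.

    Hn is H rescaled so that ||Hn||_F^2 = n_rows * n_cols. This is an
    assumed convention, not necessarily the one used in Appendix B.
    """
    n_rows, n_cols = H.shape
    Hn = H * np.sqrt(n_rows * n_cols) / np.linalg.norm(H, "fro")
    M = np.eye(n_cols) + (rho / n_cols) * (Hn.conj().T @ Hn)
    return np.linalg.slogdet(M)[1] / np.log(2.0)

se_10 = no_mc_spectral_efficiency(H, rho=10.0)
se_20 = no_mc_spectral_efficiency(H, rho=20.0)
```

Because of the normalization step, the result depends only on the structure of the channel matrix and on $\rho$, not on the absolute scale of `H`.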

D. EM Coupling and Interactions
MC can affect the achievable data rate and other performance measures in communication systems [8], [24], [49], [82], [83]. This subject will be reexamined here from the fundamental perspective introduced by using an EM approach to information theory. A detailed mathematical analysis of the effect of MC on capacity calculations is outlined in Appendix C using the mathematical apparatus developed in Sections II-IV. We give here the following basic definition and theorem followed by a brief theoretical discussion of the role of EM interactions in information capacity.
Proof: See Appendix C.

Remark 7: The condition (39) is included in order to consider the less familiar but perfectly legitimate scenario in which two signals are applied at the same point $x_i \in S_t$ but with MC taking place between the two locally orthogonal current components labeled by $s = 1, 2$. Such coupling may arise not necessarily from electrical conduction but from slight irregularities in the physical layout that lead to NF-to-NF coupling between the two current filaments.

E. EM MC and Mutual Information
In all cases (with or without MC), the joint information content of the transmitting surface $S_t$ satisfies a relation that is due to the functional dependence between the excitation point-port array and the induced current as specified by (58); see Appendix C. Note that this implies $H(J|E^{\mathrm{ex}}) = 0$, so the mutual information satisfies a corresponding relation, which is consistent with the fact that, in our model, the only source of information loss is the thermal noise added at the very end of the observation process as per (31). Hence, no information loss is experienced as we transition from the point-port signal excitation field array $E^{\mathrm{ex}}$ to the actual physical radiating current $J$. Nevertheless, from basic information theory, we know that joint information is maximal when all the random variables involved are stochastically independent, in which case the joint entropy becomes the sum of the entropies of the individual processes [35], [53]. However, the theory presented in Appendix C proves the following theorem.

Theorem 2: For an arbitrary spatiotemporal excitation field $E^{\mathrm{ex}}(x, \omega)$, and regardless of how the radiating current distribution is sampled, the obtained current samples $J_i(\omega)$ in (58) are always correlated if there is MC between the current values obtained at the same sample positions.

Corollary 1: MC (as defined by Definition 2) always leads to a signal correlation matrix $C_J(\omega)$ that is nondiagonal.

Proof: This follows immediately from Theorems 1 and 2 and formula (62) in Appendix C.
It then follows from the above that the information content of the transmitting surface before radiation satisfies the relation

$$H(J) \le \sum_n H(J_n; \sigma_{J_n}),$$

where $H(J_n; \sigma_{J_n})$ is the entropy computed using a generalization of assumption (35), namely the diagonal form (44). We thus obtain the following corollary.

Corollary 2: A no-mutual-coupling scenario yields an upper bound on the essential amount of information that can be delivered by the transmitting surface's source point-port system before radiation.

This is one reason why the uncoupled diagonal matrix form (44) is considered particularly important in our formulation. On the other hand, after the onset of EM radiation, we move from the radiating current $J$ to the radiated field, followed by the noisy observation $O$ as per (31). In contrast to the previous $E^{\mathrm{ex}} \to J$ process, the process $J \to O$ is a Gaussian nondeterministic channel model in which information loss does take place, and the capacity must be computed using the mutual information expression (32). Thus, even while $H(J)$ is reduced with MC due to the onset of electromagnetically mediated port-to-port statistical correlation (Theorem 2), the conditional entropy $H(J|O)$ is also reduced due to the general information-theoretic inequality $H(X|Y) \le H(X)$, valid for any two random variables $X$ and $Y$ [53]. Therefore, mutual information (and consequently capacity) may increase or decrease with EM MC. Nevertheless, in general, we expect a degradation of the system performance in a rich random scattering environment when MC is present, since it introduces new correlations (see footnote 10). Due to the importance of correlations induced by EM MC, an explicit expression of the correlation matrix in the presence of arbitrary MC was derived; see (62).
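The upper bound of Corollary 2 can be illustrated for Gaussian sources. The sketch below assumes a real zero-mean Gaussian vector and uses its differential entropy, for which the bound reduces to Hadamard's inequality $\det C \le \prod_i C_{ii}$. The covariance matrices are randomly generated stand-ins, and the function name is hypothetical.

```python
import numpy as np

def gaussian_entropy_bits(C):
    """Differential entropy (bits) of a real zero-mean Gaussian vector
    with covariance C: 0.5 * log2((2*pi*e)^n * det C)."""
    n = C.shape[0]
    _, logabsdet = np.linalg.slogdet(C)
    return 0.5 * (n * np.log2(2.0 * np.pi * np.e) + logabsdet / np.log(2.0))

rng = np.random.default_rng(2)
R = rng.standard_normal((6, 6))
# Correlated ("coupled") covariance: symmetric positive definite.
C_coupled = R @ R.T + 0.1 * np.eye(6)
# Same per-element variances but no cross-correlations ("uncoupled").
C_uncoupled = np.diag(np.diag(C_coupled))

h_coupled = gaussian_entropy_bits(C_coupled)
h_uncoupled = gaussian_entropy_bits(C_uncoupled)  # sum of marginal entropies
```

For a diagonal covariance, the joint entropy equals the sum of the individual entropies, so `h_uncoupled` plays the role of the right-hand side of the bound.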

V. NUMERICAL EXAMPLES AND DISCUSSION
In the following examples, the surrounding medium is assumed to be vacuum with no ground plane or scattering objects. In other words, we focus on an idealized but fundamental line-of-sight information transmission scenario. The information capacity of a generic surface, however, is in general strongly dependent on the scattering richness of the propagation medium [11], [76]. In this section, our main interest is in investigating the impact of the geometry (size and shape) of the radiating surface and the dependence of capacity on the structure of the radiation field, e.g., the impact of the distance of the observation apparatus from the radiating current (see footnote 11). For definiteness, the following examples are conducted at $f = 2.4$ GHz, but the algorithm can be used at any frequency. Finally, in all examples, we set $b^r_j(\omega)$ in (19) to unity for all values $r = 1, 2, 3$, $j = 1, \ldots, N_o$. For further details on the validity of the EM dipole model based on (10), see Remark 3 and the references cited therein.
Example 1 (Far-Field Information Capacity of Dual-Polarized Linear Continuous Source System): Consider a 1-D "surface" $S_t$ consisting of a line oriented along the $\hat{x}_3$-direction, as shown in Fig. 2 (left and center). A set of $N_t$ uncoupled point sources is arranged along the vertical direction with a total spatial extension of $L$. We examine here two possible orthogonal source polarizations, physically implemented as two perpendicular small dipoles (a cross-dipole antenna) based at $x_i$, with $i = 1, \ldots, N_t$. Each point dipole source's degree of freedom is excited at the same frequency $f = 2.4$ GHz. The capacity results of a dual-polarization $\lambda/2$ linear radiator as described above are shown in Fig. 3, which illustrates the convergence behavior of the relative capacity taken with respect to an observation sphere with radius $d = 6\lambda$ (far-field condition) and $N_o = 36$ equally spaced samples of the spherical angles $(\theta, \phi)$. The $N_t$ point sources are uniformly distributed along the line extending from $-\lambda/4$ to $\lambda/4$. It can be seen that the relative capacity $C(N_t, N_o)$ saturates as $N_t \to \infty$ with fixed $N_o$. In particular, no further significant change in capacity was observed for $N_t > 70$, indicating effective convergence of the continuous capacity approximation hierarchy. This suggests the existence of a limit on the information capacity of a $\lambda/2$ continuous source with dual polarization for uncorrelated information point ports (MC-based correlations between the current samples can always be decoupled using proper precoders). In the inset of Fig. 3, we also show the eigenchannel weights obtained by computing the eigenvalues of the system matrix $T := H^\dagger H$ for the $N_t = 70$ approximation hierarchy. We can see that only a few effectively independent pathways for transmitting information are available (about 4 or 5). The number of eigenchannels (and hence the capacity) can be increased by inserting random scatterers [76], [78] or reconfigurable intelligent surfaces [32], [34], [84], [85], [86] in order to modify the channel environment.

Footnote 10: If such additional correlations are accounted for, we anticipate that it is still possible to optimize the system performance even in the presence of MC. An investigation of this rather elaborate design problem is outside the scope of the present work.

Footnote 11: Since including scattering objects requires an extensive side treatment of the physical random scattering models to be used, which falls outside the stated scope of our article, we relegate scattering effects to future treatments.
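The eigenchannel analysis described in Example 1 can be sketched for a toy version of the dual-polarized line source. The code below is a simplified stand-in, not the article's algorithm: it assumes the standard free-space dyadic Green's function with an $e^{+j\omega t}$ time convention, drops all constant prefactors, and uses a hypothetical 6 × 6 grid of $(\theta, \phi)$ samples. The function names, the 1% significance threshold, and the choice $N_t = 20$ are illustrative assumptions.

```python
import numpy as np

def dyadic_green(r_obs, r_src, k):
    """3x3 free-space dyadic Green's function (exp(+j*omega*t) convention)."""
    Rv = r_obs - r_src
    R = np.linalg.norm(Rv)
    Rh = Rv / R
    kR = k * R
    a = 1.0 - 1j / kR - 1.0 / kR**2
    b = 1.0 - 3j / kR - 3.0 / kR**2
    scalar = np.exp(-1j * kR) / (4.0 * np.pi * R)
    return scalar * (a * np.eye(3) - b * np.outer(Rh, Rh))

def line_source_channel(N_t, L, d, n_theta=6, n_phi=6, wavelength=1.0):
    """Toy (3*N_o) x (2*N_t) channel for a dual-polarized line along x3.

    Lengths are in units of the wavelength; constant prefactors are omitted.
    """
    k = 2.0 * np.pi / wavelength
    z = np.linspace(-L / 2.0, L / 2.0, N_t)
    sources = np.stack([np.zeros(N_t), np.zeros(N_t), z], axis=1)
    pols = [np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]  # x2, x3
    # Hypothetical angular grid; polar samples are offset to avoid the poles.
    theta = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    phi = np.arange(n_phi) * 2.0 * np.pi / n_phi
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    obs = d * np.stack([np.sin(th) * np.cos(ph),
                        np.sin(th) * np.sin(ph),
                        np.cos(th)], axis=-1).reshape(-1, 3)
    H = np.zeros((3 * len(obs), 2 * N_t), dtype=complex)
    for j, r_o in enumerate(obs):
        for i, r_s in enumerate(sources):
            G = dyadic_green(r_o, r_s, k)
            for s, alpha in enumerate(pols):
                # Field (3 components) at point j due to polarization s at source i.
                H[3 * j:3 * j + 3, 2 * i + s] = G @ alpha
    return H

H = line_source_channel(N_t=20, L=0.5, d=6.0)
T = H.conj().T @ H                        # system matrix T = H^dagger H
weights = np.linalg.eigvalsh(T)[::-1]     # eigenvalues, largest first
weights = np.clip(weights, 0.0, None) / weights[0]
n_significant = int(np.sum(weights > 1e-2))  # weights above 1% of the largest
```

The count `n_significant` depends on the chosen threshold and sampling grid, so it should be read as a qualitative indicator of the number of effective eigenchannels rather than as a reproduction of Fig. 3.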
The impact of polarization, an EM degree of freedom, is one of the oldest and most well-studied aspects in the EM theory of information [11], [87], [88], [89]. In the following example, we use our method to provide a deterministic approach to estimating how this geometrical-physical parameter may influence the information capacity of a continuous source distribution.

Example 2 (Far-Field Capacity Enhancement in Dual-Polarized Continuous Linear Systems):
The well-known result about the enhancement of capacity in dual-polarized systems can also be established as a limit of continuous sources by carrying out a convergence analysis similar to Example 1 but with only one polarization allowed. Indeed, Fig. 4 shows the capacity results for $N_t = 75$, whose eigenchannels are shown in the inset of Fig. 3, but this time with the horizontal polarization along the $\hat{x}_2$-direction manually set to zero, while the vertical polarization along the $\hat{x}_3$-direction is left unaffected. It can be seen that the dual-polarization scenario exhibits higher capacity than the single-polarization case. This limit is obtained with an infinite number of uncoupled point sources covering the full spatial extension of the $\lambda/2$ radiator, and hence, the difference in capacity shown there is mainly geometrical in origin (see footnote 12). This reduction in capacity in single-polarization systems can be explained by referring to the eigenchannel weights for the single-polarization case shown in the inset of Fig. 4 and comparing them with the corresponding distribution in the inset of Fig. 3. It is clear that the former exhibits a smaller number of independent information transmission pathways than the two-polarization configuration, hence the reduction in capacity. The importance of these results stems from the fact that they are obtained by a fundamental analysis of the capacity limit of an arbitrary linear half-wavelength source distribution. Hence, relative to the current $S_o$ ($N_o = 36$) observation system, no further improvement is possible for generic AWGN channels (no knowledge of the exact channel model is available) unless a specialized precoder is used at the excitation point ports [76], [77], [81]. In other words, these limits have their origin in the geometric structure of this problem, i.e., the half-wavelength line source system with spherical field observation in the far-field zone.
Example 3 (Linear Source Far-Field Capacity Enhancement With Respect to Size): At a single frequency, even if we set the MC between point sources to zero, the capacity cannot keep increasing with $N_t$ unless the physical dimensions of the continuous source are increased. In other words, once convergence is reached, increasing the density of the transmitting point ports does not improve information capacity even when no MC between the point ports accompanies the increase in density. A demonstration of this is shown in Fig. 5 for the case of the linear source of Example 1, where the far-field capacity is computed for varying linear source size $L$. It is clear that higher capacity is obtained for larger electrical size, since we kept the same frequency (hence the same wavelength) as before. Moreover, to ensure accurate prediction, the number of sources $N_t$ was increased with increasing radiator size to guarantee that the obtained relative capacity results reasonably approximate continuous source data with respect to the spherical far-field measurement system $S_o$ ($N_o = 36$).
Next, we consider the NF analysis of the source system treated in the previous examples. The ability to improve the performance of a communication system by working in the NF zone has already been investigated from multiple viewpoints [90], [91], [92], [93], [94]. Understanding how the information capacity behaves with the distance from the transmitting surface is then important for several applications.

Footnote 12: However, it should always be recalled that by increasing the number of observation points $N_o$, one may modify the $N_t \to \infty$ far-field capacity limit. This is why what we are computing here is really the relative capacity as defined in Section IV-B, i.e., the capacity limit taken with respect to a given observation system $S_o$.
Example 4 (Linear Source NF Capacity): For $L = 0.5\lambda$, the minimum radius of the observation sphere $S_o$ is $0.25\lambda$. We show the far-field results corresponding to $d = 6\lambda$ as well as intermediate and deep NF results in Fig. 6. The number of observation points $N_o$ is increased from 36 in the previous examples to 144. It is clear that NF capacities consistently increase with decreasing receiver distance $d$, suggesting that NF communication systems enjoy higher information capacity than their far-field counterparts. Theoretically speaking, such an increase can be accounted for as being, at least in part, due to the availability of a more complex field structure in the NF scenario compared with the far field, where the radial component of the field is zero [93]. Moreover, NF processes possess latent but richer (often evanescent) subwavelength components that can be utilized for encoding extra bits of information [73], [74]. The present method then provides a way to systematically quantify how the geometrical and EM design of the communication system can be modified to maximize the utilization of such latent NF capabilities.
Example 5 (Square Patch NF Relative Capacity Convergence Analysis With Respect to $N_t$): We consider here a square patch with $L = W = 0.5\lambda$ and point sources distributed in the horizontal $\hat{x}_1$- and vertical $\hat{x}_2$-directions of $S_t$ with variable $N_t$. In the most generic case, there would be two surface polarizations $\hat{\alpha}^1_i = \hat{x}_1$ and $\hat{\alpha}^2_i = \hat{x}_2$ (horizontal and vertical polarizations). The point sources are confined to the $\hat{x}_1\hat{x}_2$ plane. In the following results, we choose the point source locations and the associated two polarization directions at each position randomly (using a uniform distribution). When $N_t$ grows very large, the square patch becomes densely covered, with most directions of excitation considered. Our objective here is to investigate the convergence behavior of the relative capacity with respect to the fixed observation system $S_o$ when the number of source points $N_t$ increases while the physical dimensions of the patch remain the same. The spherical observation (receive) system is set in the NF zone with radius $d = 1.0\lambda$ and $N_o = 400$. The capacity results are shown in Fig. 7, where it can be clearly seen that the capacity converges to the geometrical limit of a continuous square patch of this electrical size after around 5000 radiating points.
Example 6 (Square Patch Far-Field Relative Capacity Convergence Analysis With Respect to $N_o$): For a square patch with $L = W = 0.5\lambda$ and $5 \times 5$ point sources uniformly distributed in the horizontal and vertical directions of $S_t$ ($N_t = 25$), we construct a spherical observation (receive) system with radius $d = 50144\lambda$ in order to test the impact of the size of $S_o$ on the convergence of the capacity relative to $S_o$. We use only one source polarization (vertical polarization) with $\hat{\alpha}^2_i = \hat{x}_3$, while the locations $x_i$ of all point sources are confined to the $\hat{x}_1\hat{x}_2$ plane. The capacity results are shown in Fig. 8. For such a very large sphere, even though the angular span of the spherical observation angles can be well covered with high resolution using a few hundred angular point samples, the minimum physical distance between nearby point receivers on $S_o$ is proportional to the product of $d$ and the angular separation between the two points, which is significant for very large $d$ even when that angular separation is very small. This implies that the underlying MIMO system can still improve the information capacity, as the results in Fig. 8 clearly show. Due to computer memory limitations, with such a massive radius $d$, it is not possible to simulate an arbitrarily large number of receiver points in order to estimate the convergence of the Shannon capacity relative to $S_o$ ($N_o \to \infty$). With smaller $d$, one can obtain a convergent relative capacity when $N_t$ and $N_o$ become large enough. It should be noted, though, that while the capacity was increasing with $N_o$, the rank of $H$ and the weights of the system eigenchannels converge rapidly. This suggests that the information channel quickly stabilizes with increasing $N_o$ and approaches the far-field transmission capabilities of the radiating array under consideration. All extra gains in capacity observed in Fig. 8 with increasing $N_o$ are due to the high minimum critical density of receive points needed on a sphere with very large radius $d$ in order to approximate a continuous receiver. Again, these observations can be confirmed by rerunning this example with observation spheres of small radius.
Examples 5 and 6 also suggest taking extra care in calculating the NF capacity using the IDM approach based on the MIMO system capacity. The reason is that if the number of dipoles is small (between 5 and 10), then the validity of the NF formula is known to be restricted to a distance of about $\lambda$ from the radiating surface [61]. For accurate predictions of the NF capacity, the $N_t \to \infty$ limit must be taken before the distance limit. In other words, if $C(N_t, N_o, d)$ is the capacity of a radiating $S_t(N_t)$ measured by a spherical $S_o(N_o)$ system with radius $d$, then the situation expressed by (45) holds for the order in which the limits of $C(N_t, N_o, d)$ are taken. Stated differently, due to the discrete nature of our original EM model, limit operations should be approached very carefully since the underlying process involves continuous quantities. However, we define the deep NF capacity relative to a shrinking spherical surface with $N_o$ observation points by an expression of the form (46), again written in terms of $C(N_t, N_o, d)$. In this way, the shortcoming of the dipole model can be avoided, since it is known that a large enough number of dipoles can capture the structure of the NF very well even with strong MC [66]. A complete rigorous theory bypassing restrictions such as (45) is outside the scope of the present work.

Example 7 (Spherical Far-Field and NF Relative Capacity Performance Analysis With MC): In this example, we study a nonplanar radiating surface, a sphere with radius $a$, and compute both its far-field and NF capacities using a spherical observation system (see Fig. 1). The far-field capacity of a transmitting spherical surface with $a = 0.1\lambda$ is shown in Fig. 9, where the observation sphere is located at distance $d$ and we choose $N_t = N_o = 360$. In order to illustrate how capacity in such a basic system depends on distance, several simulations were conducted with variable $d$, where the far-field case corresponds to the largest distance $d = 20\lambda$. Again, we notice the overall pattern of increasing capacity when the distance of the receiver to the transmitter is reduced. For the case of MC, we assume a real random correlation matrix $C_J$ of the form $C_J = \sigma_J^2 A$, where $A$ is a $2N_t \times 2N_t$ random matrix whose entries are uniformly distributed random variables between 0 and 1. We give two examples with MC, one for the far field ($d = 20\lambda$) and the other for the deep NF ($d = 0.5\lambda$). The results labeled "with MC" in Fig. 9 correspond to two different instantiations of such an MC/source correlation scenario. In these numerical experiments, it is found that the capacity with MC is less than the corresponding capacity without MC.
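A comparison of this type can be sketched as follows. The construction of $A$ below is an assumption made for illustration and differs from a plain uniform random matrix: a uniform random matrix $U$ is used to form $A = U U^T$, which is symmetric and positive semidefinite (as a valid correlation matrix must be), and $A$ is rescaled to have the same trace as the identity so that the total source power is unchanged. The channel is a random stand-in, and all names and values are hypothetical.

```python
import numpy as np

def capacity_bits(H, C_J, noise_var):
    """log2 det(1 + H C_J H^H / noise_var), assuming white noise."""
    N = H.shape[0]
    M = np.eye(N) + (H @ C_J @ H.conj().T) / noise_var
    return np.linalg.slogdet(M)[1] / np.log(2.0)

rng = np.random.default_rng(3)
N_t, N_o = 5, 8
# Random stand-in for the EM channel matrix (not a Green's-function model).
H = (rng.standard_normal((3 * N_o, 2 * N_t))
     + 1j * rng.standard_normal((3 * N_o, 2 * N_t))) / np.sqrt(2.0)
sigma_J2, noise_var = 1.0, 0.1

# Assumed construction: entries of U are uniform in [0, 1]; A = U U^T is
# symmetric positive semidefinite, rescaled so that trace(A) = 2*N_t.
U = rng.uniform(0.0, 1.0, size=(2 * N_t, 2 * N_t))
A = U @ U.T
A *= (2 * N_t) / np.trace(A)

C_no_mc = sigma_J2 * np.eye(2 * N_t)  # uncorrelated sources
C_mc = sigma_J2 * A                   # correlated sources ("with MC")

cap_no_mc = capacity_bits(H, C_no_mc, noise_var)
cap_mc = capacity_bits(H, C_mc, noise_var)
```

Which of the two capacities is larger depends on the channel and on the particular correlation matrix, so the sketch makes no claim about the ordering; it only shows how the two cases can be evaluated on an equal-power basis.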

VI. CONCLUSION
We introduced a general method to define and compute useful information capacity measures for generic radiating surfaces, with possible applications as transmitters or reconfigurable reflecting surfaces that might be deployed in wireless communication systems. Various computational examples were provided, including 1-D and 2-D transmitting systems with various polarization degrees of freedom. The far-field capacity was computed, and it was found that the capacity relative to a given observation surface converges with increasing $N_t$. Moreover, we computed the capacity as a function of the distance from the source and found that the NF capacity consistently exceeds the far-field capacity. Some examples were given where the stochastic correlation between induced radiating current elements, caused by EM MC, influenced the capacity of a spherical transmitting surface. The proposed method works for both planar and curved structures and can be used in future research on how to design optimized capacity-driven EM communication systems. Moreover, our approach can be integrated with existing full-wave EM CAD tools in order to supply Shannon information capacity data for familiar antenna array systems used in prototyping and designing current or future wireless communication systems.
The theory developed here and its computational algorithm can be further developed and expanded in future works. For example, it may be interesting to generalize the observation process $S_o$ to real-life measurement scenarios by allowing for some correlation between measurement probes and, hence, measurement noise. One may also consider the impact of corners (diffraction effects), scattering clusters, nearby objects, and similar effects on the communication channel. Detailed optimization studies may also be carried out to design high-capacity communication links that exploit the presence of EM MC and related correlation processes.

APPENDIX A ON MATHEMATICAL NOTATION
This article uses a formalism that combines methods and notations often deployed in two distinct fields, EM theory and information theory. For that reason, some tension in our use of various notations may naturally arise. To avoid confusion, we have attempted to explain the meaning of the notation within the main text. For ease of reference, the Nomenclature provides a complete list of all notations utilized in this article. In particular, we consistently distinguish between a dyad and a matrix array. In addition, throughout this article, we drop explicit frequency dependence whenever no confusion may arise.

APPENDIX B CHANNEL MATRIX NORMALIZATION AND THE DEFINITION OF SIGNAL-TO-NOISE RATIO
The physical channel matrix $H$ in (31) can be renormalized by rewriting the field observable vector in the slightly different form (47), which involves a normalization factor $\alpha \in \mathbb{R}^+$ and the normalized channel matrix. In terms of this model, the total received power can be computed, yielding an expression in which the subscript $F$ stands for the Frobenius norm and Tr stands for the matrix trace operation. Thus, the ratio of the total measured power to the total noise power at the observation surface $S_o$ can be estimated. We then impose normalization conditions. Using (47) in (36) leads to a spectral efficiency expression, to which (49) and (50) are applied.

The quantity in (53) is the $i$th component (point-port) excitation field tangential to the surface $S_t$ at $x_i$, where we recall that $\hat{\alpha}^s_i := \hat{\alpha}^s(x_i)$, $s = 1, 2$. The excitation field point-port array $E^{\mathrm{ex}}$ generates the corresponding array of currents $J$ defined by (25). Our goal is to investigate the relation between MC and the statistics of these two arrays in light of the information-theoretic framework of the capacity formula (34). We first note that in the frequency domain, the spatiotemporal relation (5) reduces to a corresponding relation for $J(x, \omega)$ [19], [95]. The excitation field corresponding to (16) can be written in terms of $\delta_S$, a surface Dirac delta function as defined in [19] and [60]. Substituting (56) into (54) and using (55), we find an expression for the induced current. If we choose to sample the radiating current at the same locations $x_i$, $i = 1, \ldots, N_t$, then the obtained current samples $J_i := J(x_i, \omega)$ may be expressed as in (58), where $F^{ss'}_{ii'}(\omega) := F^{ss'}(x_i, x_{i'}; \omega)$, $i, i' = 1, \ldots, N_t$.
Therefore, the $i$th current sample is a linear combination of the excitation signals at all other point ports indexed by $i'$. This is the most general form of the $N_t$-point-port approximation of the continuous transmitting surface $S_t$. The radiating current on the transmitting surface possesses a correlation matrix $C_J$ of the form (29). Let the average values of the excitation signals all be zero, i.e., assume $\mathbb{E}[E^{\mathrm{ex}}_{is}] = 0$ for all $i$ and $s$. It then follows from (58) that $\mathbb{E}[J_i] = 0$ as well. It is not difficult to see that, through suitable matrix partitioning operations, $C_J$ can be put into a general block structure. In order to find an explicit formula for the matrix entries, we substitute (58) into (29) and, after some straightforward manipulations, arrive at (62), where $i_1, i_2 = 1, \ldots, N_t$ and $s_1, s_2 = 1, 2$. Expression (62) relates the correlation between the radiating current samples to cross-correlation phenomena as seen at the point-port excitation field locations. In particular, Theorems 1 and 2 immediately follow from (62). Furthermore, note the nonlocal nature of the relation: the correlation between any pair of current samples depends on the cross correlation between the field value samples at all other corresponding excitation field position pairs. Most importantly, though, even when the field excitations are uncorrelated, relation (62) shows that the current samples remain correlated as long as the EM coupling Green's function coefficients, here the array (59), are not negligible.
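The last observation can be illustrated with a small matrix model. The sketch below assumes that the sampled currents depend linearly on the excitations, written as $J = F E^{\mathrm{ex}}$ with a coupling array $F$, so that the current correlation matrix is $C_J = F C_E F^\dagger$. The specific random entries of $F$ and the coupling strength are hypothetical and are not derived from any Green's function.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2 * 4        # 2 polarizations x N_t = 4 point ports (illustrative)
coupling = 0.2   # hypothetical strength of the off-diagonal (mutual) terms

# No MC: each current sample depends only on its own port excitation.
F_uncoupled = np.diag(rng.uniform(0.5, 1.5, n)).astype(complex)

# With MC: add nonzero off-diagonal coupling coefficients.
offdiag = coupling * (rng.standard_normal((n, n))
                      + 1j * rng.standard_normal((n, n)))
np.fill_diagonal(offdiag, 0.0)
F_coupled = F_uncoupled + offdiag

C_E = np.eye(n)  # uncorrelated, unit-variance, zero-mean excitations

def current_correlation(F, C_E):
    """Correlation matrix of J = F E for zero-mean E: C_J = F C_E F^H."""
    return F @ C_E @ F.conj().T

def max_offdiag(C):
    """Largest magnitude among the off-diagonal entries of C."""
    return np.max(np.abs(C - np.diag(np.diag(C))))

leak_uncoupled = max_offdiag(current_correlation(F_uncoupled, C_E))
leak_coupled = max_offdiag(current_correlation(F_coupled, C_E))
```

With a diagonal coupling array the current correlation matrix stays diagonal, whereas nonzero off-diagonal coupling produces correlated current samples even though the excitations themselves are uncorrelated.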