High-Resolution Parameter Estimation for Wideband Radio Channel Sounding

Multidimensional channel sounding measures the geometrical structure of mobile radio propagation. The parameters of a multipath data model in terms of directions, time-of-flight, and Doppler shift are estimated from observations in frequency, time, and space. A maximum likelihood estimation framework allows joint high resolution in all dimensions. The prerequisite for this is an appropriate parametric data model that represents the multipath propagation correctly. At the same time, a device data model is necessary that typically results from calibration measurements. The used model should be as simple as possible, since its structure has a considerable effect on the estimation effort. For instance, the inherent effort in parameter search is reduced if the influence of the parameters is kept independent. Therefore, the data model is characterized by several approximations. The most important is the “narrowband assumption,” which assumes a low relative bandwidth and also avoids considering any frequency response in magnitude and phase. We extend the well-known multidimensional Richter maximization approach (RIMAX) parameter estimation framework by including proper frequency responses. The advantage reveals itself with high bandwidth in the mmWave and sub-THz range. It allows for a more realistic modeling of antenna arrays, and it breaks with the usual narrowband model and allows a better modeling of mutual coupling and time delay effects. If the interacting object extends over several delay bins (hence, an extended target in radar terminology), we propose a model that assigns a short delay spread and a frequency response to the propagation path that associates it with the respective object. We verify the validity of the device model by numerical experiments on simulated and measured antenna data and compare it with RIMAX. 
In addition, we use synthetic data based on ray-tracing results and measurements, both ranging from 27.0 to 33.0 GHz, with known ground-truth information and show that the proposed estimator delivers better performance for higher relative bandwidths than the conventional RIMAX implementation.


Sebastian Semper, Michael Döbereiner, Christian Steinmetz, Markus Landmann, and Reiner S. Thomä, Life Fellow, IEEE
Index Terms-Algorithms, antenna arrays, array signal processing, maximum likelihood estimation, parameter estimation.

I. INTRODUCTION

The transmission channel is the "heart" of every transmission system. Its statistical behavior determines the theoretically maximum possible data rate that can be transmitted without error. Knowledge of its physical properties is, therefore, of great importance for estimating the achievable performance of such a system and for designing and dimensioning the signal processing methods in the transmitter (TX) and receiver (RX). In the case of mobile radio channels, these properties are most complicated to describe, since they depend on the multipath effects of electromagnetic wave propagation.
These effects are time-varying, depend on the respective environment, and are accompanied by additional temporal, spatial, and frequency-dependent dispersion. In addition, there is the influence of the antenna characteristics. These cause a directional and frequency-dependent weighting of the transmitted and received waves and, thus, influence the estimation of all parameters of the channel. Throughout the progress of mobile radio generations toward 6G, we observe that the system continuously develops not only to mitigate this dispersion in all of these dimensions but also to exploit it in favor of higher performance. This includes the intensive use of the spatial dimension by antenna array processing for more efficient radio access. At the same time, the shift to higher frequencies up to millimeter waves and sub-THz unlocks ever wider bandwidths. This increases the need to study and model the effects of wave propagation in these multiple dimensions down to the last detail.
The usual procedure to study radio propagation starts with channel sounding. A wideband, real-time multiple input-multiple output (MIMO) channel sounder records the transfer characteristics in the multidimensional aperture domain given by the dimensions frequency bandwidth, observation time, and space at TX and RX. From there, we calculate the distribution of geometrical wave propagation parameters in the discrete multidimensional domain time-of-flight (ToF), Doppler shift, direction of departure (DoD), and direction of arrival (DoA). Although both domains are related by a multidimensional Fourier transform in a straightforward way, it has been shown that a model-based approach combining both domains offers many advantages [1], [2]. It leads to high resolution in the parameter space (beyond Rayleigh) and allows considering physical effects of wave propagation and accounting for impairments of the respective sounding
device in more detail [3]. Moreover, it bridges the gap toward geometry-based stochastic channel models, which are used for mobile radio physical layer standardization and performance evaluation [4]. Being model-based, the data analysis procedure of multidimensional sounding starts with the estimation of the model parameters from a sequence of recorded snapshots in the aperture domain. However, the observed aperture volume (in frequency, time, and space) is limited in all dimensions. The frequency dimension is limited to a finite bandwidth because of sounder hardware resources and regulatory issues. The observation time is limited by the variability of the changing propagation geometry. The spatial dimensions are limited by the size and shape of the antenna arrays. Using antenna arrays at TX and RX and estimating the polarization-dependent propagation directions jointly at both sides of the link, i.e., double directional sounding [5], allow us to de-embed the sounder from the measured data and, hence, are the prerequisite of measurement device-independent channel descriptions.
In this article, we extend the Richter maximization approach (RIMAX) introduced by Richter [6], a very flexible parameter estimation framework, which is an iterative maximum likelihood parameter search. It builds on a data model consisting of a propagation data model and a device (or instrument) data model. The basic propagation data model describes multipath propagation by a sum of rays (or specular paths), which appear as a multidimensional Dirac distribution in the ToF, Doppler, DoD, and DoA domain. In the aperture domain, this corresponds to a multidimensional complex exponential in the respective dimensions. In addition to the specular paths, the RIMAX data model also contains nonspecular (dense or diffuse) contributions that account for the background noise floor and lead to a better estimate of the specular paths, effectively prewhitening the data according to a parametric model. We call the proposed RIMAX extension Python maximization approach (PyMAX), whose novelty mainly lies in the relaxation of the so-called narrowband assumption for both the data model and the device model.

A. Motivation
In this work, we focus on the signal processing that revolves around parameter-based estimation techniques [7], such as multiple signal classification (MUSIC) [8] and estimation of signal parameters via rotational invariance technique (ESPRIT) [9], [10], as well as more general methods based on maximum likelihood, such as RIMAX [1], or on expectation maximization, such as space-alternating generalized expectation-maximization (SAGE) [11].
1) Narrowband Assumption: Already at first glance, we face a seemingly insoluble contradiction: DoD/DoA is usually identified by estimating phase differences at antenna arrays. Strictly speaking, this is only valid for sinusoidal signals. On the other hand, ToF estimation requires wideband signals, the wider the better. A similar contradiction appears with respect to Doppler, which is also a phenomenon defined for a single sinusoid only. We call a signal wideband if its support in the frequency domain contains more than one frequency. Furthermore, the absolute bandwidth is the difference between the highest and lowest frequencies in the signal.
It has been shown [6], [12] that the described contradiction relaxes if the relative bandwidth, i.e., the absolute bandwidth normalized by the center frequency, is low. In this case, it is approximately valid to assume that some parameters influence the wideband signal as if it were a single sinusoid: the so-called narrowband assumption.
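To make the notion of relative bandwidth concrete, the following sketch compares the spatial phase of an array element at the band edge with its narrowband approximation at the center frequency. All numbers are assumed examples (a 27-33 GHz band and a hypothetical eight-element half-wavelength array), not taken from the paper's measurements.

```python
import numpy as np

# Toy numbers for a 27-33 GHz band (assumed, matching no particular device).
c = 3e8                      # speed of light in m/s
fc = 30e9                    # center frequency
B = 6e9                      # absolute bandwidth
rel_bw = B / fc              # relative bandwidth, here 0.2

# Hypothetical uniform linear array: 8 elements, half-wavelength spacing at fc.
n_elem = 8
d = c / (2 * fc)
theta = np.deg2rad(40.0)     # assumed direction of arrival

# Spatial phase of the last element at the upper band edge ...
phase_edge = 2 * np.pi * (fc + B / 2) * (n_elem - 1) * d * np.sin(theta) / c
# ... and its narrowband approximation, evaluated at fc only.
phase_nb = 2 * np.pi * fc * (n_elem - 1) * d * np.sin(theta) / c

# The approximation error grows linearly with the relative bandwidth.
phase_error = abs(phase_edge - phase_nb)
```

For these example values, the phase error at the band edge exceeds one radian, which illustrates why the approximation degrades at high relative bandwidths.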
As a consequence, many multidimensional estimation algorithms, such as RD ESPRIT or MUSIC, are straightforward generalizations of their single-dimensional versions. The reason for this straightforward extension is that, algebraically, the data model decomposes into pairwise-independent Kronecker products [6], [9], where each factor corresponds to one parameter and one signal dimension. This algebraic structure tremendously simplifies the resulting estimation algorithms while still enabling joint estimation of the multidimensional path parameters.
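The Kronecker structure of a separable narrowband model can be sketched as follows; the delay and spatial frequency values are arbitrary toy numbers used only for illustration.

```python
import numpy as np

# Toy sizes and parameters for a two-dimensional (frequency x space) model.
n_f, n_s = 16, 8
tau = 3.2e-9                          # assumed delay
mu = 0.4                              # assumed normalized spatial frequency
f = np.arange(n_f) * 10e6             # frequency axis

a_f = np.exp(-2j * np.pi * f * tau)               # delay response over frequency
a_s = np.exp(1j * np.pi * mu * np.arange(n_s))    # spatial response over elements

# The joint model vector over both dimensions equals the Kronecker product
# of the per-dimension factors -- the structure exploited by R-D
# ESPRIT/MUSIC-type algorithms under the narrowband assumption.
joint = np.outer(a_f, a_s).ravel()
assert np.allclose(joint, np.kron(a_f, a_s))
```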
2) Toward Higher Relative Bandwidths: However, with increasing relative bandwidths, this assumption is violated to an increasing degree [3]. More specifically, we show in this publication that higher bandwidths force us to extend the work presented in [1] and [6] in various ways to deal with this more challenging scenario. This more complex data model is inevitable if we strive for unbiased and physically meaningful estimates of the propagation parameters [3], i.e., if we want to avoid misspecification.
The driving forces behind these extensions have several sources. First, the part of the model that accounts for the measurement device starts to violate the narrowband assumption. Usually, the device data model is determined by a calibration procedure, which measures the response of the device and is then incorporated in the estimation procedure.
Usually, the antenna arrays are the weakest point in terms of calibration measurements. This becomes more severe with wider bandwidth and higher frequencies. The reasons are obvious: antenna elements typically consist of waveguide components or are resonant (e.g., patches), which makes the response frequency-dependent. If these antenna elements are arranged as an array, mutual coupling adds further influence, which results in direction-dependent frequency response distortion. Furthermore, we show that a larger distance between the elements makes it necessary to consider the phase response for each frequency individually and not approximate it at the center frequency by invoking the narrowband assumption. Then, the algebraic separation of DoD/DoA and ToF vanishes, since the frequency and spatial dimensions are now connected via these frequency-dependent phase shifts. Put another way, we are confronted with a system identification problem [13], since we wish to account for the polarization- and angle-dependent response, i.e., input-output relation, of the employed antenna elements.
Yet another bandwidth issue relates to the interacting objects (we sometimes refer to these as scatterers) of wave propagation. In the original RIMAX framework, the data model presumes that the response of these objects is not frequency selective. This may no longer apply for higher frequencies and wider relative and absolute bandwidths. The reason is that the scattering/reflecting objects may extend over several delay bins. Then, we may no longer be able to resolve the waves scattered off the structural details (i.e., its elements) of the interacting objects in terms of DoD, DoA, ToF, or Doppler. Therefore, the observed path weights, which are attributed to the same respective extended object, will effectively build up from a sum of unresolved smaller contributions. This results in a frequency-selective response of the respective path weight coefficient, which needs to be estimated as well.
This more refined data model is harder to handle not only from a purely computational standpoint, but also when it comes to notation and implementation. Moreover, we need to evaluate how exactly the observation model has to be altered to pose a good trade-off between complexity and descriptive power. Hence, the model design process has requirements not only in terms of correct specification, but also in terms of feasibility during implementation.
B. State of the Art

1) Vectorized Notation: When reviewing the common literature that deals with channel estimation and the algorithms revolving around it [1], [9], [14], one finds that it is common practice either to represent inherently multidimensional data in vectorized form and the involved linear transforms as matrices instead of higher-order tensors, or, conversely, to treat multidimensional data with full tensor algebra. The reasons for this are plentiful. First, it is familiar territory from a conceptual point of view, since vectors and matrices are well known to all of us. Second, most linear algebra toolboxes, for instance MATLAB [15], have only limited support for higher-dimensional data and their manipulation. Consequently, with increasing complexity of the data models involved in parameter-based estimation, this formalism places a considerable burden on us both in terms of mathematical notation and of implementation in soft- or firmware. Hence, we need a notation that is convenient both in terms of algorithm adaptability and model generality.
2) Antenna Models: If wideband spatial measurements of a radio environment are carried out by means of antenna arrays, the parametric estimation approach requires an accurate enough and computationally efficient representation of the antenna array's complex, polarimetric radiation characteristic, which is a function of frequency and excitation angle [3]. For the single-frequency case, efficient solutions, such as the effective aperture distribution function (EADF) [16], exist and are well established when developing channel estimation algorithms [1]. The originally single-frequency EADF has also recently been extended to the wideband case [14] by representing the antenna response in delay domain via its impulse response (IR), where it is concentrated on a few time samples. Also, other more physically motivated descriptions have been put forward. For instance, methods based on the vector spherical harmonics (VSH) [17], spherical wave expansion (SWE) [18], or spherical mode expansion (SME) [19] use the structure of the solution space of the differential equations that govern the behavior of radiated waves.
However, for our purposes, we are not so much interested in a physically motivated description, but rather in one from a signal processing perspective. In addition, it has turned out to be useful [1] if analytic expressions for the derivatives of the beam pattern with respect to the incident angle can be provided by this description. In this case, the EADF is the method offering the best trade-off between computational complexity and model accuracy. Yet, realistic extensions of the EADF to the wideband scenario do not exist as yet.
3) Frequency Selective Interactions: Another effect that is observed at increasing relative bandwidths is that the transfer function of interacting objects can no longer be considered constant [20], [21]. Hence, methods to account for this effect, which also influences the observed transfer function on a per-scatterer basis, have been proposed [12], [22], [23]. Based on these observations, it is crucial to have a method that is able to account for these frequency-dependent scattering events within the radio channel with as few degrees of freedom as possible.

C. Contributions
In this article, we will mostly extend the data model in terms of wideband (hence, frequency-dependent) modeling of antenna array responses and the path weight, i.e., scattering, response.
In Section I-D3, we propose using the Einstein notation for describing the data model for MIMO radio channels. As we will see throughout this manuscript, this notation style allows efficient, concise, and easy-to-digest formulas that can be implemented in an almost one-to-one manner.
In Section II-C1, we introduce a methodology for turning wideband antenna measurements into a compressed analytic description using the EADF. We use suitably placed sinc functions to interpolate the antenna array's response to arbitrary frequencies using only a few interpolation points, resulting in a close-to-optimal representation in terms of memory and computational effort.
In Section II-C2, we infer the frequency-dependent behavior of propagation paths traversing the radio channel, which cannot be accounted for by the common parameters, such as delay, Doppler, DoA, DoD, or polarization.
We will also address suitable multidimensional estimation procedures that adapt the algorithms to these more complex scenarios in terms of model quality and allow porting them to heterogeneous computing architectures.
D. Notation

1) Basic Symbols: Let $\mathbb{N}$ denote the natural numbers $\{1, 2, \ldots\}$. We denote the real numbers by $\mathbb{R}$ and the field of complex numbers by $\mathbb{C}$. Let $\bar{z}$ denote the complex conjugate of $z \in \mathbb{C}$ and $\Re\{z\}$, $\Im\{z\}$ the real and imaginary parts, respectively. We use $\jmath$ as the imaginary unit. Let $E(x)$ denote the expectation of the random variable $x$. Let $I_{\mathbf{n} \times \mathbf{n}} \in \mathbb{R}^{\mathbf{n} \times \mathbf{n}}$ denote the identity tensor, such that $I_{\mathbf{n} \times \mathbf{n}} A = A$. We use $A^{-1}$ for the inverse tensor, such that $A \cdot A^{-1} = I$, and $|A|$ denotes the determinant of $A$. Finally, let $f \circ g$ denote the composition of the functions $f$ and $g$.
2) Vectors and Matrices: A full MIMO snapshot is naturally an object that consists of multiple distinct measurement dimensions, i.e., frequency, time, and the spatial dimensions at both the receive and transmit sides. Conventionally, the literature deals with this specific data structure in two different ways. Either a MIMO observation is described as a four-way data structure, or it is vectorized, i.e., reshaped, into a vector. The latter case, where a measurement is vectorized, introduces a lot of notational complexity. Say the data are represented by a vector $y$ and we access the $i$th entry: without four-level indexing arithmetic, it is not directly clear which frequency or time sample the $i$th entry targets. Moreover, the formulation of the mapping from parameters to an observation has to resort to various different types of matrix-matrix and matrix-vector products, which hide the essence of the data model behind complex array operations.
Hence, in this publication, we advocate a more convenient method to work with higher-dimensional data and parameter spaces than what is usually used. To this end, we first want to make the distinction among vectors, matrices, and tensors. We denote vectors by small bold-faced letters, e.g., $x$, $a$, and so forth. However, these objects are not defined to be vectors by their dimensionality, but by their role in equations. More specifically, for a given $k$-tuple of natural numbers $\mathbf{n} = (n_1, \ldots, n_k) \in \mathbb{N}^k$ (non-italic bold face), we let the vectors reside in $\mathbb{C}^{\mathbf{n}} = \mathbb{C}^{n_1 \times \cdots \times n_k}$. The important consequence is that we can work with multidimensional objects as if they were "regular" vectors. Given any multi-index $\mathbf{i} \leqslant \mathbf{n}$ (elementwise), we can access the $\mathbf{i}$th entry of a vector $a \in \mathbb{C}^{\mathbf{n}}$ via $a_{\mathbf{i}} \in \mathbb{C}$. In addition, for two tuples $\mathbf{n}$ and $\mathbf{m}$, we can write $a \in \mathbb{C}^{\mathbf{n} \times \mathbf{m}}$, where $\mathbb{C}^{\mathbf{n} \times \mathbf{m}}$ is the Cartesian product of $\mathbb{C}^{\mathbf{n}}$ and $\mathbb{C}^{\mathbf{m}}$. Summarizing, vectors are used for "points" in a vector space of an arbitrary dimension structure.
The concept of matrices or tensors does not have much to do with the form of their representation, but with the role these objects play [24]. Given two vector spaces $\mathbb{C}^{\mathbf{n}}$ and $\mathbb{C}^{\mathbf{m}}$ and a linear mapping $\mathbf{f} : \mathbb{C}^{\mathbf{n}} \to \mathbb{C}^{\mathbf{m}}$ (we use a bold-faced $\mathbf{f}$, since it is inherently vector-valued) between them, we can define the tensor $F \in \mathbb{C}^{\mathbf{m} \times \mathbf{n}}$ associated with this linear mapping, once we have chosen bases for $\mathbb{C}^{\mathbf{n}}$ and $\mathbb{C}^{\mathbf{m}}$, respectively. Although $F$ is represented as an element of $\mathbb{C}^{\mathbf{m} \times \mathbf{n}}$, its role is to define a linear mapping between two vector spaces. One example of such an object we will encounter is the covariance of a complex vector-valued random variable. We will use capital bold-faced letters for tensors and matrices, such as $F$, $R$, and so on. In cases when $\mathbf{n}$ and $\mathbf{m}$ contain only one element each, we call $F$ a matrix.
3) Einstein Notation: In order to conveniently work with these high-dimensional objects, we introduce a multi-index flavor of the so-called Einstein notation [25]. This notation is used in expressions involving one or more vectors and tensors.
The first use of this notation is in forming elementwise products of vectors, i.e.,

$$c_{\mathbf{i}, \mathbf{k}, \mathbf{j}} = a_{\mathbf{i}, \mathbf{k}} \cdot b_{\mathbf{k}, \mathbf{j}}$$

where the two vectors $a \in \mathbb{C}^{\mathbf{n} \times \boldsymbol{\kappa}}$ and $b \in \mathbb{C}^{\boldsymbol{\kappa} \times \mathbf{m}}$ are multiplied elementwise along the dimensions indexed by $\mathbf{k}$ to form the vector $c \in \mathbb{C}^{\mathbf{n} \times \boldsymbol{\kappa} \times \mathbf{m}}$. In addition, we can also encode summation over certain dimensions posterior to multiplication by means of

$$c_{\mathbf{i}, \mathbf{j}} = a_{\mathbf{i}, \mathbf{k}} \cdot b_{\mathbf{k}, \mathbf{j}}$$

where the two vectors $a \in \mathbb{C}^{\mathbf{n} \times \boldsymbol{\kappa}}$ and $b \in \mathbb{C}^{\boldsymbol{\kappa} \times \mathbf{m}}$ are multiplied elementwise along the dimensions indexed by $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ and then summed over those indexed by $\mathbf{k}$ to form the result $c \in \mathbb{C}^{\mathbf{n} \times \mathbf{m}}$. Hence, summation happens over those indices that are not present on the left-hand side (LHS) in these equations. As a result, complicated product and summation indication via $\sum$ and $\prod$ becomes superfluous. The advantages of this notation are plentiful. First, we have shown already that it can readily be used for indices that address more than one vector dimension. Second, we can express all the familiar vector and tensor operations much more conveniently in this notation, also for higher-dimensional cases: almost all operations encountered in higher-dimensional linear algebra can be summarized under one unified notational umbrella. The need for hard-to-decipher vectorization, matricization, and unfolding operations is removed.
Another advantage is the availability of the einsum function in Python, which readily allows the usage of this notation directly in programming code, while being implemented with close to bare-metal performance. The function automatically makes use of the single instruction multiple data (SIMD) functionality of modern processors and of optimal contraction paths, and it optionally uses multithreading operations provided by the math kernel library (MKL). Moreover, it has recently also been made available for graphical processing units (GPUs) via the CuPy library. Summarizing, the Einstein notation serves the purpose of a concise mathematical formulation, which can also be easily transferred to an efficient implementation of the invoked vector and tensor products.
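As a small illustration of how the notation maps to code, the two operations described above (pure elementwise product and product with subsequent summation) can be written directly as einsum calls. The dimensions $n = 2$, $\kappa = 3$, $m = 4$ are toy values chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))  # a in C^{n x kappa}
b = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))  # b in C^{kappa x m}

# Elementwise product along k, no summation: c1[i,k,j] = a[i,k] * b[k,j].
c1 = np.einsum('ik,kj->ikj', a, b)
# Summation over k (the index absent on the LHS): c2[i,j] = sum_k a[i,k] * b[k,j].
c2 = np.einsum('ik,kj->ij', a, b)

assert c1.shape == (2, 3, 4)
assert np.allclose(c2, a @ b)            # reduces to the ordinary matrix product
assert np.allclose(c1.sum(axis=1), c2)   # summing c1 over k recovers c2
```

The same subscript strings carry over unchanged to multi-indices spanning several data dimensions, which is what makes the notation attractive for the models in this article.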

II. MODELING OBSERVATIONS
Usually, a sampled version of the transfer function in complex baseband of a MIMO radio channel is represented by a 4-D vector $h \in \mathbb{C}^{N_{tx} \times N_{rx} \times N_f \times N_t}$, where $N_{tx}$ denotes the number of transmit antennas, $N_{rx}$ the number of receive antennas, $N_f$ the number of frequency samples, and $N_t$ the number of time samples. For our purposes, we define $\mathbf{n}_{mimo} = (N_{tx}, N_{rx}, N_f, N_t)$. In order to describe the behavior of the radio channel in physically relevant quantities, one defines a parametric model via a function $s : T \to \mathbb{C}^{\mathbf{n}_{mimo}}$ and a model for the noise distribution $P_n(\delta)$, where $T$ denotes a set for the deterministic parameters with vector space structure, and also, $S$ is a suitable parameter space with $\delta \in S$ for the noise distribution $P_n(\delta)$. Parameter estimation then hopes that a noisy observation $y \in \mathbb{C}^{\mathbf{n}_{mimo}}$ of $h$ described via

$$y = s(\theta) + n, \quad n \sim P_n(\delta)$$

constitutes a valid model for $y$. Valid, in the sense that there is no model misspecification [26]. This means that there exist parameters $\theta_0 \in T$ and $\delta_0 \in S$, such that the residual $y - s(\theta_0)$ follows the distribution $P_n(\delta_0)$. We wish to develop an expression for $s$, such that this correct specification is a reasonable assumption and such that we can solve the inverse problem of estimating $\theta_0$ and $\delta_0$ from the observation $y$. In our case, the parameter space $T$ should encode the model order, i.e., the number of specular propagation paths, and the structural parameters of the individual paths, i.e., delay, Doppler, DoD, DoA, and polarimetric path weights. The parameter space $S$ should account for measurement noise as well as the parameters of the stochastic components of the channel; see Section III-B.

A. Basic Linear Data Model
At first, we want to introduce a very basic property of s, which is based upon the linear-superposition principle. The transfer function of the radio channel stems from the solution of a system of partial differential equations. Hence, the propagation behavior of multiple waves is the superposition of the individual waves. We also assume that the so-called specular propagation paths can be modeled as plane waves leaving the TX and impinging at the RX.
This means that the most general structure relevant for our considerations of the parametric model $s$ is

$$s(\theta) = \sum_{p=1}^{P} a(\mu_p) \cdot \gamma_p$$

where we refer to $\gamma$ as the linear path weights. Note that here $\cdot$ denotes a yet-to-be-specified vector-vector product by means of specifying the indices $\mathbf{n}_s$, $\mathbf{n}_\gamma$, and $\mathbf{n}_a$. The so-called atomic function $a : \mathbb{R}^q \to \mathbb{C}^{\mathbf{m}}$ together with a single $\gamma \in \mathbb{C}^{\ell}$ denotes a model for the transfer function of a single propagation path. Also, note that the parameters of interest are now $\gamma = (\gamma_1, \ldots, \gamma_P) \in \mathbb{C}^{\ell \cdot P}$ and $\mu = (\mu_1, \ldots, \mu_P) \in \mathbb{R}^{q \cdot P}$, which results in a parametric data model $s : \mathbb{C}^{\ell \cdot P} \times \mathbb{R}^{q \cdot P} \to \mathbb{C}^{\mathbf{n}_{mimo}}$, which we are going to specify further in the following. Hence, the parameter space for $\theta$ is now $T = \mathbb{C}^{\ell \cdot P} \times \mathbb{R}^{q \cdot P}$ to model a multipath scenario containing $P$ specular propagation paths.
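A minimal sketch of this superposition principle for a frequency-only toy channel follows; all names and values (weights, delays, sampling) are illustrative and not the paper's full multidimensional model.

```python
import numpy as np

n_f = 64
f = np.arange(n_f) * 10e6        # assumed frequency axis

def atom(tau):
    """Atomic function: transfer function of one specular path with delay tau."""
    return np.exp(-2j * np.pi * f * tau)

gammas = np.array([1.0, 0.5 - 0.2j])   # assumed linear path weights
taus = np.array([10e-9, 35e-9])        # assumed delays

# Linear superposition of the individual path responses.
s = sum(g * atom(t) for g, t in zip(gammas, taus))
assert s.shape == (n_f,)
```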

B. Narrowband Data and Device Model
Before we continue with the derivation of the extended data and propagation models, we want to use the Einstein formalism introduced above to present the model used in [1], which is based upon the formulation of the polarimetric double-directional radio channel presented in [5]. All quantities that are subject to change from the narrowband description to the wideband case are denoted with a tilde ($\tilde{\cdot}$).
Let $\tilde{a}^{tx}$ and $\tilde{a}^{rx}$ denote the polarimetric narrowband responses of the transmit and receive antennas, respectively, which depend on the angle of the impinging wave. Furthermore, define the functions $a^f : \mathbb{R} \to \mathbb{C}^{N_f}$ and $a^t : \mathbb{R} \to \mathbb{C}^{N_t}$ to model the frequency response and time response of a narrowband plane wave being exposed to a shift in delay $\tau \in \mathbb{R}$ and a Doppler shift $\alpha \in \mathbb{R}$, which was measured at frequencies $f \in \mathbb{R}^{N_f}$ and time instances $t \in \mathbb{R}^{N_t}$. Based on this, a single path parameter is composed of $\mu_p = (\vartheta^{tx}_p, \phi^{tx}_p, \vartheta^{rx}_p, \phi^{rx}_p, \tau_p, \alpha_p)$, where $(\vartheta^{tx}_p, \phi^{tx}_p)$ denotes the DoD of the $p$th path, $(\vartheta^{rx}_p, \phi^{rx}_p)$ denotes the DoA of the $p$th path, and $\tau_p$ and $\alpha_p$ denote its ToF and Doppler shift, respectively. Finally, $\tilde{\gamma}_p \in \mathbb{C}^{2 \times 2}$ constitutes the polarimetric mixing matrix of the $p$th path. Then, the parametric model for a complete observation can be written as

$$\tilde{s}_{m_1, m_2, m_3, m_4} = \sum_{p=1}^{P} \tilde{a}^{tx}_{m_1, p_1}(\vartheta^{tx}_p, \phi^{tx}_p)\, \tilde{\gamma}_{p, p_1, p_2}\, \tilde{a}^{rx}_{m_2, p_2}(\vartheta^{rx}_p, \phi^{rx}_p)\, a^{f}_{m_3}(\tau_p)\, a^{t}_{m_4}(\alpha_p).$$

The algorithmic framework presented in [1] makes repeated use of the outer-product structure of $\tilde{s}$, which is reflected in the fact that each parameter only influences a single and unique data dimension. This allows substantial algorithmic optimizations, resulting in a very efficient implementation of RIMAX when using this parametric model to estimate $\gamma$ and $\eta$.
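The narrowband delay and Doppler responses act as simple phase ramps over the frequency and time axes and combine as an outer product; a sketch with assumed sampling values and sign conventions:

```python
import numpy as np

f = np.linspace(27e9, 33e9, 32)   # assumed measured frequencies
t = np.arange(16) * 1e-3          # assumed snapshot time instances
tau, alpha = 20e-9, 50.0          # assumed delay and Doppler shift

a_f = np.exp(-2j * np.pi * f * tau)    # a^f(tau): phase ramp over frequency
a_t = np.exp(2j * np.pi * alpha * t)   # a^t(alpha): phase ramp over time
# (the sign conventions here are an assumption of this sketch)

# Outer product over the two dimensions, in Einstein-notation style.
s_ft = np.einsum('i,j->ij', a_f, a_t)
assert s_ft.shape == (32, 16)
```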
In order to efficiently calculate $\tilde{a}^{tx}$ or $\tilde{a}^{rx}$, the implementation of [1] makes use of the so-called EADF, which represents the polarimetric beam pattern of the transmit and receive antennas via

$$\tilde{a}_{m_1, p_1}(\vartheta, \phi) = v_{l_1, l_2, p_1, m_1}\, d^{ele}_{l_1}(\vartheta)\, d^{azi}_{l_2}(\phi)$$

where $v \in \mathbb{C}^{L_\vartheta \times L_\phi \times 2 \times N_{tx}}$ can be determined from discrete measurement data in an anechoic chamber. The vector-valued functions $d^{ele} : \mathbb{R} \to \mathbb{C}^{L_\vartheta}$ and $d^{azi} : \mathbb{R} \to \mathbb{C}^{L_\phi}$ in the above equation are Fourier basis vectors defined as

$$d^{ele}_{l}(\vartheta) = e^{\jmath (l - N_e - 1) \vartheta}, \quad d^{azi}_{l}(\phi) = e^{\jmath (l - N_a - 1) \phi}$$

where $L_\vartheta = 2N_e + 1$ and $L_\phi = 2N_a + 1$. This way, we have obtained a characterization of a single measurement of the MIMO transfer function, which we can describe with a concise and clean notation. This is especially helpful when we need to calculate analytic expressions for the derivatives of the models with respect to their parameters, as used in Section III.
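The EADF construction for a single element and polarization can be sketched as a 2-D Fourier series. The coefficient values below are random placeholders standing in for a calibrated tensor, and the symmetric index convention $l = -N, \ldots, N$ is an assumption of this sketch.

```python
import numpy as np

# Assumed truncation orders; L = 2N + 1 Fourier coefficients per angle.
N_e = N_a = 4
L_theta, L_phi = 2 * N_e + 1, 2 * N_a + 1

# Random placeholder coefficients standing in for a calibrated EADF tensor.
rng = np.random.default_rng(0)
v = rng.normal(size=(L_theta, L_phi)) + 1j * rng.normal(size=(L_theta, L_phi))

def d_vec(angle, N):
    """Fourier basis e^{j l angle}, l = -N..N (assumed index convention)."""
    return np.exp(1j * np.arange(-N, N + 1) * angle)

def beam(theta, phi):
    """Beam pattern as the contraction v[l1,l2] d_ele[l1] d_azi[l2]."""
    return np.einsum('lm,l,m->', v, d_vec(theta, N_e), d_vec(phi, N_a))

# By construction, the pattern is 2*pi-periodic in both angles.
assert np.isclose(beam(0.3, 1.1), beam(0.3 + 2 * np.pi, 1.1))
```

Because the representation is analytic in the angles, derivatives with respect to $\vartheta$ and $\phi$ follow by differentiating the complex exponentials, which is exactly what gradient-based estimators need.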

C. Proposed Wideband Data and Device Model
The goal of this section is to use the narrowband model s and its building blocks from (5) as a baseline and present step-by-step extensions, such that it is applicable in scenarios with higher relative bandwidth.
1) Wideband Antenna Responses: We would like to start with the refinement of the models for the antenna responses a tx and a rx . For the time being, we neglect the distinction between tx and rx, since due to Lorentz reciprocity, the derivation holds for both the receive and transmit modes of the antenna array, and we only treat the tx case.
When it comes to deriving an analytic model for the wideband antenna response, one faces two problems. First, the calibration measurement in an anechoic chamber is carried out for the antenna such that angular and frequency samples of the antenna response are obtained at a discrete set of points. However, well-performing estimation algorithms need a continuous description both in angle and frequency. Hence, we are confronted with an interpolation task. Second, the acquired measurement data usually are redundant and contaminated by measurement noise. Hence, a reduced and denoised representation of these data should be derived for use during the estimation procedure.
Say, we have three measurement grids $\vartheta \in \mathbb{R}^{N_\vartheta}$, $\phi \in \mathbb{R}^{N_\phi}$, and $f \in \mathbb{R}^{N_f}$, and we obtained polarimetric samples of the antenna response on this grid, such that we have

$$u_{i, j, k, p_1, t_x} = a_{p_1, t_x}(\vartheta_i, \phi_j, f_k)$$

where $p_1 \in \{1, 2\}$ denotes one of the two orthogonal excitation polarizations and $t_x = 1, \ldots, N_{tx}$ indexes the array elements. Keeping a spherical angle $(\vartheta_i, \phi_j)$ fixed for a single polarization $p_1$ and element $t_x$, we can consider the antenna's transfer function as a complex-valued function $a_{p_1, t_x}(\vartheta_i, \phi_j, \cdot) : \mathbb{R} \to \mathbb{C}$. An example is given in Section IV-A. Most practically relevant antennas are designed in such a way that they minimize the distortion of the signal being fed into them. One consequence of this is that the extent of the corresponding angle- and polarization-dependent IR in time domain is restricted to a short time interval; i.e., it has a finite but also small support set. This constitutes the assumption that the antenna acts as a finite impulse response (FIR) filter in time domain. Hence, the transfer function is a slowly varying, i.e., smooth, function. This means that sinc interpolation of the transfer function is possible with only a few degrees of freedom. Essentially, we choose a set of orthogonal basis functions [13] to represent the input-output relation of the antenna elements. Let $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$ and say the necessary number of degrees of freedom is bounded by $Q \ll N_f$; then, we can represent $u_{i, j, k, p_1, t_x}$ via

$$u_{i, j, k, p_1, t_x} \approx \sum_{q} v_{i, j, q, p_1, t_x}\, \mathrm{sinc}\!\left( Q \frac{f_k - f_1}{f_2 - f_1} - q \right).$$

However, choosing the sinc functions to be centered only on the frequencies $f_1 + q \cdot (f_2 - f_1)/Q$ inside the band introduces so-called Gibbs ringing at the edges of the frequency band; see Section IV-A. This ringing would be negligible if the transfer functions $a_{p_1, t_x}(\vartheta_i, \phi_j, \cdot)$ and all their derivatives were periodic with period $f_2 - f_1$. In practice, this is rarely satisfied, and we should not make this assumption. The effective time-aperture distribution function (ETADF) [14] artificially periodifies the sampled data, but only up to the zeroth order in terms of derivatives.
Instead, we add sinc functions with centers outside the acquired frequency band by extending the index range of the centers by ω on both sides, for some ω ⩾ 0. Relating to [13], this makes the chosen basis for system identification nonorthogonal in order to reduce the reconstruction error at the cost of a higher estimation variance. To retrieve the weights v_{i,j,q,p_1,t_x}, we solve a least-squares problem, which for v constitutes the best linear unbiased estimator (BLUE) in the case of white Gaussian measurement noise. After obtaining the reduced set of coefficients v ∈ C^{N_ϑ × N_ϕ × (Q+2ω+1) × 2 × N_tx}, we apply the conventional single-frequency EADF interpolation scheme for each individual q, p_1, and t_x separately. Here, we can interpret b as the angular-dependent sinc-interpolation coefficients. So, instead of applying the EADF to the beam patterns for each frequency in f, we apply it to the coefficients v_{i,j,q,p_1,t_x} of the shifted sinc functions ξ_q, which, in the case of antenna IRs with small and finite support, are substantially fewer than the number of measured frequencies.
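As a minimal sketch of this fitting step (all function names and the toy transfer function below are ours, not the paper's code), the weights v for one fixed angle, polarization, and element can be obtained by ordinary least squares over the extended shifted-sinc basis:

```python
import numpy as np

def sinc_basis(freqs, f1, f2, Q, omega):
    """Shifted-sinc basis evaluated at the frequencies `freqs`.
    Centers are spaced (f2 - f1)/Q apart; `omega` extra centers are
    placed beyond each band edge to suppress Gibbs ringing."""
    df = (f2 - f1) / Q
    centers = f1 + df * np.arange(-omega, Q + omega + 1)
    return np.sinc((freqs[:, None] - centers[None, :]) / df)

def fit_transfer_function(freqs, samples, f1, f2, Q, omega):
    """Least-squares fit of a sampled transfer function to the
    (nonorthogonal) shifted-sinc basis; the BLUE under white noise."""
    X = sinc_basis(freqs, f1, f2, Q, omega).astype(complex)
    v, *_ = np.linalg.lstsq(X, samples, rcond=None)
    return v, X @ v  # coefficients v and denoised reconstruction

# Toy example: a smooth transfer function over a 10-GHz band,
# N_f = 201 frequency samples reduced to Q + 2*omega + 1 = 16 weights.
f = np.linspace(25e9, 35e9, 201)
h = np.exp(-2j * np.pi * f * 0.2e-9) * (1 + 0.1 * np.cos(2 * np.pi * f / 10e9))
v, h_hat = fit_transfer_function(f, h, 25e9, 35e9, Q=11, omega=2)
```

The toy response corresponds to an FIR-like IR with a delay spread well inside the window 1/df representable by the basis, which is the regime the FIR assumption targets.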
By doing so, we have reduced the number of necessary angular interpolation calls by a factor of N_f/(Q + 2ω + 1). It remains to be shown that the angular aperture of the coefficients v_{i,j,q,p_1,t_x} is actually limited; this is treated in the Appendix. The resulting analytic descriptions of the individual sinc coefficients' EADFs are then used to obtain a completely analytic expression for the complex polarimetric beam pattern, where f denotes the required frequencies and f_f denotes the fth entry in f. This model function can now readily replace ã_tx and ã_rx in (5) to attain a data model that is able to account for frequency-dependent antenna responses. In summary, the proposed methodology allows us to encode the complete wideband far-field beam pattern in the tensor g ∈ C^{L_ϑ × L_ϕ × (Q+2ω+1) × 2 × N_tx} in order to derive an analytical expression for the antenna's behavior.
2) Dispersive Targets: As a last step for the data model, we wish to account for propagation paths that have an extent in time domain. Similar to the beam pattern of an antenna, we assume that this extent is confined to a small interval in time around the delay τ. Consequently, we impose the FIR assumption a second time, now on the response of individual targets. Hence, we again model the transfer functions of the reflections at the objects as smooth enough that only a few degrees of freedom suffice to describe their behavior.
In terms of our model, this implies that the atoms a in (2) have to be superposed depending on the measured frequency. This can be achieved by defining a tensor β ∈ C^{N_f×2×2×P}, which, due to the band limitation, i.e., the concentration in time domain, can be expressed as a superposition of shifted sinc functions ζ_s. In this case, s ranges from 1 − σ to S + σ for natural numbers σ ⩾ 0 and S > 0. Similar to Section II-C1, the overlap is denoted by σ, and the dispersiveness of the scatterers is modeled by S.
If we now start with a model with frequency-flat scatterers reading s = γ_{n_γ,p} · a(η), we use β from (15) to extend it to account for frequency-dependent scattering via s = γ_{s,n_γ,p} · ζ_s(f_f) · a(η).
With this parametric model at hand, we can describe the individual transfer functions of the scatterers solely by means of the array β and the functions ζ_s. However, to be more consistent in notation with the data model corresponding to the narrowband assumption, we rename β to γ from here on.
3) Complete Wideband Data Model: We are now in a position to combine the previous extensions into a wideband data model, which is a direct extension of (5). This is accomplished by replacing the various parts of the model with their newly introduced wideband counterparts. This relates to (2) in the following way. In the most general form of the model, we have ℓ = [S + 2σ, 2, 2] (hence, ℓ_P = [S + 2σ, 2, 2, P]) and q_P = [6, P] consisting of six parameters, namely, ToF, DoD, DoA, and Doppler. It should be noted that, depending on the scenario and the measurement equipment, one can also use any intermediate version between (5) and (16) in order to selectively account for certain effects while neglecting others. Hence, for a whole polarimetric MIMO snapshot described by P propagation paths, we have T = C^{(S+2σ)×2×2×P} × R^{6×P}. It is also straightforward to adapt (16) to subsets of measurement dimensions. For instance, if only frequency and time samples are acquired, the model reduces to one where only the delay and Doppler parameters together with the parameters of the dispersive targets are estimated. This increase in model complexity in favor of more expressiveness would be of no use if we could not estimate the involved parameters efficiently and reliably. Hence, the next section deals with parameter estimation in this new setting.

III. ESTIMATION PROCEDURE
In this section, we outline the estimation procedure employed to extract the parameters of interest from an observation y ∈ C^{n_mimo}. The general idea is to successively increase the number of estimated propagation paths P ∈ N as long as the reliability of the estimate can be deemed sufficient given the currently inferred noise statistics. While incrementing P, the path parameters as well as the noise parameters are updated according to the maximum likelihood (ML) principle, ultimately using a gradient iteration. This way, one ensures that, for a given model complexity P, we have found an approximately optimal (read: high resolution) set of parameters in the sense of the likelihood. Hence, the proposed algorithm can be called an iterative maximum likelihood estimator.
As such, we need means of carrying out several distinct computations. First, we need an algorithm that is capable of adding a new propagation path to the model, i.e., a global search strategy. Second, we need an algorithm to jointly refine the parameters currently treated by the estimator. Third, the parameters of the stochastic part of the channel, composed of measurement noise and paths that are not resolvable with the given apertures and bandwidth, i.e., the distribution P_{n,δ}, have to be determined given the current stage in the estimation of the specular propagation paths. Finally, a methodology to determine the reliability of the estimated paths is needed, such that spurious paths can be detected and removed from the estimate. A very high-level picture of the components and their interaction is given in Fig. 1.

A. Maximum Likelihood Estimation
Due to the random nature of an observation y, we need a suitable statistical model for its distribution. We assume that the noise n is complex zero-mean Gaussian noise with covariance R = R(δ_0), which we assume to be known or at least provided during the estimation routine; the algorithmic details of how to attain δ_0 are beyond the scope of this publication. We briefly discuss its influence on the whole framework in Section III-B.
If we have knowledge of R, we can calculate the so-called negative log-likelihood λ̃ : C^{ℓ_P} × R^{q_P} → R of our model parameters given an observation y via λ̃(γ, η|y) = ln(|R|) + r_{n_1}(γ, η) · R^{-1}_{n_1,n_2} · r_{n_2}(γ, η)*, where the residual is defined as r(γ, η) = y − s(γ, η) (19). Since the first summand in λ̃ does not depend on the parameters of the specular propagation paths, we can also consider λ(γ, η|y) = r_{n_1}(γ, η) · R^{-1}_{n_1,n_2} · r_{n_2}(γ, η)* (20) to formulate the maximum likelihood problem. This approximates a solution to the inverse problem of finding parameters (γ′, η′), i.e., minimizers of (21), that are considered the most likely based on an observation y, the assumed data model s, and the statistical model in the form of the covariance R. From an optimization perspective, (21) constitutes a generalized nonlinear least-squares problem. As such, it is well suited for smooth optimization techniques given sufficiently good initialization. Hence, in Sections III-C and III-D, we provide a blueprint for such a smooth iteration scheme as well as a heuristic for the required initialization close to a local minimum of λ.
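Dropping the constant ln|R| term, the objective in (20) can be evaluated directly from the residual; a minimal sketch (function and variable names are ours):

```python
import numpy as np

def neg_log_likelihood(y, s_model, R):
    """lambda(gamma, eta | y) = r^H R^{-1} r with r = y - s(gamma, eta),
    i.e., (20) without the parameter-independent ln|R| term.
    `s_model` is the model response already evaluated at (gamma, eta)."""
    r = y - s_model
    # use solve() instead of an explicit inverse for numerical stability
    return float(np.real(np.conj(r) @ np.linalg.solve(R, r)))

# sanity check: for white noise R = sigma^2 I, lambda = ||r||^2 / sigma^2
y = np.array([1 + 1j, 2.0 + 0j, -1j])
lam = neg_log_likelihood(y, np.zeros(3), 2 * np.eye(3))  # ||y||^2 / 2
```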

B. Estimation of the Covariance
The estimation of the covariance R via a suitable parametric model R(δ 0 ) serves the main purpose to account for the received power that cannot be described well by the discrete model of plane waves used for the specular components (SCs).
Its parametrization is adapted to the specific characteristics of the power distribution of a radio channel in delay domain in the form of the dense multipath component (DMC) [27], [28]. These models are still under active research, e.g., in terms of the estimation of their angular component [28]. Recently, the DMC has been extended to a multimodal model [29], which becomes increasingly important at higher bandwidths.
We reuse the original parametric model by RIMAX for R, which describes the covariance of a Gaussian noise process of independent samples in frequency domain, where the noise power across these bins follows an infinite impulse response (IIR) filter combined with a parameter accounting for constant measurement noise across the frequency domain. This results in δ ∈ R^4 as the parameter of R. More specifically, we assume R_f : R^4 → C^{N_f×N_f} to be a Toeplitz matrix-valued function, where I_{N_f} ∈ C^{N_f×N_f} denotes the identity matrix, from which we obtain the complete covariance R ∈ C^{n_mimo×n_mimo}. Usually, after appropriate initialization, its parameters are refined using a Newton iteration similar to the one in (26), where the objective is again the negative log-likelihood λ, but the parameters are now those in δ. During this process, the parameters (γ, η) of the SC are kept fixed. Hence, the DMC estimation usually operates on a likelihood for a fixed residual r(γ, η). Consequently, the work in [30] makes the case that the DMC accounts not only for diffuse scattering, i.e., the quasi-random superposition of weak specular paths, but also for modeling and estimation errors. In this sense, the DMC and SC can theoretically account individually for the same part of the channel IR. However, one usually aims at representing as much power as possible by the specular path model, since these are the parameters of interest.
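As an illustration, the frequency covariance of such a one-pole (IIR) exponential power-delay profile plus a white noise floor can be sketched as follows. The exact normalization and parametrization of the original RIMAX DMC model differ in detail, so the parameter roles below (noise floor alpha0, DMC power alpha1, decay beta_d, base delay tau_d) are our assumptions, not the paper's definitions:

```python
import numpy as np

def dmc_covariance(N_f, alpha0, alpha1, beta_d, tau_d):
    """Sketch of a DMC-style Hermitian Toeplitz frequency covariance:
    white noise floor alpha0 plus a one-sided exponential power-delay
    profile with decay beta_d and base delay tau_d (in samples).
    This mimics, but does not reproduce exactly, the RIMAX model."""
    k = np.arange(N_f)
    lag = k[:, None] - k[None, :]          # frequency-bin lag i - j
    a = np.abs(lag)
    # covariance sequence: Fourier transform of a shifted exponential PDP
    kappa = alpha1 * np.exp(-2j * np.pi * tau_d * a / N_f) \
        / (beta_d + 2j * np.pi * a / N_f)
    R = np.where(lag >= 0, kappa, np.conj(kappa))  # Hermitian Toeplitz
    return R + alpha0 * np.eye(N_f)

R_demo = dmc_covariance(64, alpha0=0.01, alpha1=1.0, beta_d=0.1, tau_d=3.0)
```

Since the Toeplitz part samples the Fourier transform of a nonnegative delay profile, it is positive semidefinite, and the noise floor makes the total covariance positive definite.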

C. Initializing the Iterative ML Estimator
Given that, ultimately, we wish to employ a local optimization routine for a high-resolution estimate of the propagation parameters, we need a proper initialization of this smooth refinement procedure. However, due to the highly nonconvex nature of the objective function, we have to be content with an algorithmic simplification of this initialization at the cost of possibly ending the estimation in a local minimum of λ. In order to put the proposed estimation routine in context, we start by shortly outlining the procedure used by RIMAX to find good initial estimates.
1) Adding New Paths Under the Narrowband Assumption: The first simplification imposed by RIMAX is that an initial guess is built up incrementally by using an efficient method for adding a single new path to the estimate. Second, after a new path has been added to the estimate, all path parameters are refined independently by employing the SAGE algorithm. The efficiency of the RIMAX procedure relies heavily on the Kronecker or Khatri-Rao structure of s̃, i.e., how the atomic functions contribute to it. This, in turn, stems from the narrowband assumption. A schematic representation of this approach is given in Fig. 2. Note that we are mainly concerned with finding accurate estimates of the ToF, Doppler, DoA, and DoD parameters, since the polarimetric path weights γ can always be estimated as the BLUE by means of least squares, which is explained in the needed generality in the Appendix. This also holds true in the case of the more complex model (16) and the respective atomic functions used therein. Moreover, the quantity r̃ is defined similarly to (19) to represent the current residual given the estimated parameters. For the initialization of a single new path, RIMAX uses the fact that each data dimension corresponds to an individual parameter dimension, such that a grid-based matched filter, where the grid density only has to obey the critical sampling rate for this measurement and parameter dimension, gives an accurate enough initial set of delay, Doppler, DoA, and DoD parameters. Between adding paths, each execution of SAGE further refines all currently present paths. After completing the iteration in Fig. 2, the smooth gradient optimization is employed to more accurately resolve especially the highly coupled paths.
2) Adding New Paths With Coupled Parameters: However, model (16) does not allow us to treat the parameters independently. Hence, the algorithmic simplifications imposed by RIMAX in the previous section cannot be justified anymore and are prone to deliver suboptimal solutions. In the proposed modification of the original RIMAX estimation, we aim at finding a compromise between the bandwidth-imposed higher model complexity and computational efficiency.
RIMAX reduces the search space from several paths at once to a single path. However, due to the coupling of the parameters, in order to viably initialize a single path, we would have to implement the grid-based matched filter for all parameters jointly. In a full MIMO setting, this would result in a grid of size N_TX · N_RX · N_t · N_f, where N_TX and N_RX have to be determined according to the receive and transmit antennas' apertures. For practically relevant aperture sizes and measurement resolutions, the grid becomes prohibitively large for a correlation-based matched filter over all parameters jointly.
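A quick back-of-the-envelope comparison illustrates why the joint grid is infeasible while per-dimension searches are not; all grid sizes below are assumed for illustration and are not taken from the paper:

```python
# Illustrative grid sizes for initializing one path (numbers assumed):
N_tx_grid = 180 * 360        # DoD grid (co-elevation x azimuth)
N_rx_grid = 180 * 360        # DoA grid
N_t, N_f = 100, 451          # Doppler and delay grids

joint = N_tx_grid * N_rx_grid * N_t * N_f      # joint matched filter
separable = N_tx_grid + N_rx_grid + N_t + N_f  # per-dimension searches

print(f"joint:     {joint:.2e} grid points")   # ~1.9e14, prohibitive
print(f"separable: {separable:.2e} grid points")
```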
Hence, to add a single new path, we are required to stick with the procedure put forward by RIMAX for computational reasons. We only modify it in such a way that, given a current set of parameters (γ, η), we do not use r̃(γ, η) in the narrowband matched filter, but instead the more accurate r(γ, η), which is based on s.
Once a new path is added to the model, we carry out SAGE to refine the propagation paths independently of each other, but the respective parameters of a single path are updated jointly. This is computationally feasible, since the employed SAGE is executed after the global search. Hence, it may use a set of iteratively refined small local grids of size at most 2^6 if all six propagation parameters are estimated. Consequently, as outlined in Fig. 2, PyMAX makes use of the more complex wideband data model (16) where it is computationally affordable and, hence, balances the tradeoff between accuracy and run time. After the initialization of the estimator, we turn to the iterative refinement using a smooth local optimizer, which is treated in Section III-D.

D. Fisher-Scored Parameter Refinement
In this section, we assume that we have initial estimates of the parameters in the form of γ_init and η_init and wish to refine them further by directly minimizing the function λ in (20); a necessary condition for a local optimum is a vanishing Jacobian J_{η,γ} λ.
Consider the well-known Newton iteration scheme for finding the root of the derivative of a real-valued function g : R^n → R, i.e., for finding a solution to J_x g(x) = 0: x_{k+1} = x_k − ε_k · (H_x g(x_k))^{-1} · J_x g(x_k), where H_x g is the second-order derivative, or Hessian matrix, of the function g with respect to x, and ε_k > 0 denotes the step size in step k. However, in general, the Hessian is not a positive definite matrix and, hence, not well suited to yield an iteration converging to a local minimum of g. In addition, the computational complexity of evaluating it analytically and explicitly is often prohibitively high. However, in our case of maximum likelihood estimation, we are in the fortunate position that λ(γ, η|y) is actually a random variable when considered for fixed γ and η, since y is random. This allows us to consider the Fisher information matrix (FIM), i.e., the expectation of H λ, which is a positive definite matrix, and to replace the Hessian by the FIM in the iteration scheme above. This method is known as the natural gradient method, or Fisher-scored gradient iteration [31]. Hence, we need analytic expressions for J λ(γ, η) and F(γ, η), which is the scope of Section III-E.
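A sketch of one Fisher-scoring step, verified on a real-valued linear-Gaussian toy model where a single full step reaches the minimizer (all names below are ours):

```python
import numpy as np

def fisher_scoring_step(theta, jac_lambda, fim, eps):
    """theta_{k+1} = theta_k - eps * F(theta_k)^{-1} * J_lambda(theta_k),
    i.e., a Newton step with the Hessian replaced by the FIM."""
    return theta - eps * np.linalg.solve(fim(theta), jac_lambda(theta))

# Toy check: s(theta) = A @ theta with R = I, so
# lambda(theta) = ||y - A theta||^2, J = -2 A^T r, F = 2 A^T A.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = A @ theta_true                       # noiseless for a clean check

jac = lambda th: -2.0 * A.T @ (y - A @ th)
fim = lambda th: 2.0 * A.T @ A
theta = fisher_scoring_step(np.zeros(3), jac, fim, eps=1.0)
# for a linear model, one full step solves the LS problem exactly
```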

E. Derivatives
Most generally, a function f : C^m → C^n with z → f(z) can also be considered as a function f : R^m × R^m → R^n × R^n, where a complex vector is represented by z = x + ȷy.
Then, the first-order derivatives of a function f, namely, J_{z*} f : C^m → C^{n×m} and J_z f : C^m → C^{n×m}, are given as applications of the two Wirtinger derivative operators [32]. We find that, although not strictly necessary, the Wirtinger calculus is the natural domain in which to derive derivatives for functions that combine complex-valued functions into real-valued functions. Furthermore, an indexing notation is used to access the partial derivative of the jth function value with respect to the ith element of z. This way, both j and i can be used in expressions involving the Einstein notation. 1) Jacobian: As we have seen in Section III-D, we need an analytic expression for J_{γ,η} λ : C^{ℓ_P} × R^{q_P} → C^{ℓ_P} × R^{q_P}, which we develop step by step below. First, we decompose γ into its real and imaginary parts via γ = ρ + ȷι, where ρ, ι ∈ R^{ℓ_P}. We slightly misuse the notation by defining J^{γ,η}_{n,n_ℓ,n_q,s} s = (J^ρ_{n,n_ℓ,s} s, J^ι_{n,n_ℓ,s} s, J^η_{n,n_q,s} s) as a tuple of the derivatives, and we do so as well for J^{γ,η}_{n_ℓ,n_q,s} λ = (J^ρ_{n_ℓ,s} λ, J^ι_{n_ℓ,s} λ, J^η_{n_q,s} λ).
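For reference, the two Wirtinger operators referred to above take the standard form (with z = x + ȷy, applied componentwise):

```latex
\frac{\partial}{\partial z}
  = \frac{1}{2}\left(\frac{\partial}{\partial x}
    - \jmath\,\frac{\partial}{\partial y}\right),
\qquad
\frac{\partial}{\partial z^{*}}
  = \frac{1}{2}\left(\frac{\partial}{\partial x}
    + \jmath\,\frac{\partial}{\partial y}\right)
```

A function is holomorphic exactly when its derivative with respect to z* vanishes, which is why treating z and z* as formally independent variables is convenient for real-valued objectives such as λ.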
The notation we use works as follows in our specific case. For instance, the array J η n,n q ,s s denotes the first-order derivative of the nth entry of the model s with respect to the (n q , s)th entry of the parameter η. Similarly, the array J ρ n ℓ ,s λ denotes the first-order derivative of the likelihood λ with respect to the (n ℓ , s)th entry of the real part of γ .
Considering expression (20) together with the chain rule and the product rule, and using the fact that ρ, ι, and η are real-valued, we get J^x_{n_ℓ,n_q} λ = −2ℜ( r*_{n_1} · R^{-1}_{n_1,n_2} · J^x_{n_2,n_ℓ,n_q} s ) (28), where x is a wildcard for ρ, ι, and η. Let us again take a look at the notation and how the contraction using Einstein notation works. The product in (28) takes r* ∈ C^{n_mimo} and multiplies it with the first four dimensions, indexed by n_1, of the inverse covariance R^{-1} ∈ C^{n_mimo×n_mimo}; we then sum over this pointwise product, since n_1 is missing on the left-hand side, and we do the same for the dimensions indexed by n_2 and the Jacobian of s, namely, J^x_{n_2,n_ℓ,n_q} s ∈ C^{n_mimo} × C^{ℓ_P} × R^{q_P}. Since we sum over both the indices in n_1 and n_2, what remains is an array with the dimension of our parameter space.
The results above tell us to derive an expression for the components of J_{γ,η} s, which we can insert into (28). To this end, we first have to consider (2) and its specification for our case in (16), from which we see that the derivative is in fact structured. We observe that the summand computed from parameter η_{i_1} does not depend on parameter η_{i_2} for i_1 ≠ i_2 and vice versa. Hence, when representing evaluations of J_{γ,η} s, we can condense them to J_{γ,η} s(γ, η) ∈ C^{ℓ×q×P}, which then also holds true for the derivative of λ.
Keeping this in mind, the product rule leaves us with J^ρ_{n,n_ℓ,s} s = I^{ℓ×ℓ}_{n_ℓ,n_γ} · a_{n_a}(η_p), J^ι_{n,n_ℓ,s} s = ȷ I^{ℓ×ℓ}_{n_ℓ,n_γ} · a_{n_a}(η_p), and J^η_{n,n_q,s} s = γ_{n_γ,s} · J^η_{n_a,n_q} a(η_p) (31), where, in all cases, the vector product is the same as in (2), and we have n_ℓ, n_γ ⩽ ℓ and n_a, n_q ⩽ q. The only missing puzzle piece is the derivative of the atom functions with respect to η. Again, we can exploit the structure of a, which is of the form a_{n_a}(η) = a^1_{n^1_a}(η_{q_1}) · … · a^k_{n^k_a}(η_{q_k}), where the indices q_1, . . . , q_k are pairwise distinct, such that no parameter occurs twice in any atom. This structural assumption is satisfied in all our data models, since each a^i is defined by a different propagation parameter.
This structural knowledge can be used together with the product rule to get J_{η_{q_i}} a_{n_a}(η) = a^1_{n^1_a}(η_{q_1}) · … · a^{i−1}_{n^{i−1}_a}(η_{q_{i−1}}) · J a^i_{n^i_a}(η_{q_i}) · a^{i+1}_{n^{i+1}_a}(η_{q_{i+1}}) · … · a^k_{n^k_a}(η_{q_k}) (32) and to finally end up with J_η a : R^q → C^{m×q} given by (33) as the derivative of the atom function a. We wish to stress that the procedure of substituting (32) into (33), which is fed into (31), which is finally plugged into (28) together with (29) and (30), holds without modification both for the narrowband model s̃ and the more complex wideband model s. The only thing that changes is the nature of the high-dimensional vector product that combines the different atomic functions.
2) Fisher Information Matrix: In order to implement the Fisher-scored gradient method for minimizing λ, we also need an analytic expression for F. This is easily obtained by means of the Slepian-Bangs formula, which, in our specific case of a Gaussian log-likelihood as objective, requires us to calculate F^{x,y}_{n_x,s_1,n_y,s_2} = −2ℜ( J^x_{n_1,n_x,s_1} s · R^{-1}_{n_1,n_2} · J^y_{n_2,n_y,s_2} s ) (34), where again x and y are (possibly equal) wildcards for ρ, ι, and η, and where we can readily use (29)-(31) to calculate the FIM. With these results at hand, we have provided the necessary building blocks to implement a maximum likelihood estimator, as outlined in Fig. 1, and we can move to the numerical experiments in Section IV.
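Up to the sign and conjugation conventions of the Wirtinger calculus used above, (34) corresponds to the familiar Gauss-Newton/Slepian-Bangs form F = 2 Re(J^H R^{-1} J) for a Gaussian likelihood with known covariance; a minimal sketch with the Jacobian stacked column-wise over all real parameters (names are ours):

```python
import numpy as np

def fisher_information(J, R):
    """F[a, b] = 2 * Re( J[:, a]^H R^{-1} J[:, b] ), with the model
    Jacobian J stacked column-wise over all (real) parameters."""
    return 2.0 * np.real(np.conj(J).T @ np.linalg.solve(R, J))

# demo: complex Jacobian, scaled-identity covariance R = 3 I
rng = np.random.default_rng(1)
J_demo = rng.standard_normal((10, 4)) + 1j * rng.standard_normal((10, 4))
F_demo = fisher_information(J_demo, 3.0 * np.eye(10))
```

By construction, the result is symmetric and positive semidefinite, which is what makes it a safe Hessian surrogate in the scoring iteration.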

A. Wideband Antenna Responses
In Section II-C1, we presented a new methodology for deriving an analytic, compact, and efficient representation of far-field antenna beam patterns. Here, we showcase numerical experiments to demonstrate the validity of the approach and compare it with other methods.
1) Simulated Antenna Data: As a first study, we consider an SPUCPA with two rings of eight elements each, where each element has two feeding ports, such that we end up with 32 ports in total for the array structure. The wideband far fields are simulated using Ansys High Frequency Electromagnetic Simulation Software (Ansys HFSS) from 25.0 up to 35.0 GHz with a 50.0-MHz uniform sampling distance for two orthogonal excitation polarizations. In the angular domain, samples are taken with a 1° spacing both in elevation and azimuth. This makes the complete data in (9) u ∈ C^{181×360×201×2×32}. In Fig. 3 (top), we plot the absolute value of the sampled transfer function for a single element in its main-beam direction as a function of frequency. In addition, we use the proposed interpolation given in Section II-C1 and compare it with the ETADF [14] in terms of the resulting interpolation error. For the calculations in Section II-C1, we use ω = 2 and Q = 11, whereas for the ETADF, we truncate the IR in time domain to Q + 1 samples.
While both methods perform similarly in the center of the frequency band, the interpolation error of the ETADF becomes more severe at the border of the simulated domain, which stems from the implicit assumption that the beam pattern is periodic with a periodicity of 10.0 GHz; this is violated for this antenna structure. Indeed, the assumption is presumably never satisfied for any antenna, unless the measurements or simulations were carried out over such a large bandwidth that tapering does not affect the frequency domain of interest, or the transfer function has decayed enough in absolute value at the edges that the periodification assumption is not violated significantly.
Although the effect of the proposed method is not highly significant for the nearly perfect simulation data, we will see, in Section IV-A2, that for measurement data, the denoising and also time-gating capabilities of the FIR model are substantial.
2) Antenna Measurement Data: Now, we would like to study a more realistic scenario, where measurement data were collected in an anechoic chamber and have to be transformed into an analytic description of the antenna to be used in estimation.
To this end, we use measurement data of an antenna array structure consisting of six dual-linearly polarized aperture-coupled stacked-patch elements arranged in a regular circular structure (see Fig. 4) with a parallel side distance of 55.5 mm. Thus, the elements are spaced roughly one wavelength apart and show a matching bandwidth (back-reflection coefficient S11 less than −10.0 dB) from 21.5 to 44.0 GHz.
The measurements are performed in the anechoic chamber of the Fraunhofer Institute for Integrated Circuits IIS, Erlangen. As the electrical measurement setup, we employ a θ-over-φ positioner and a probe stand that can be rotated to automatically measure the pattern of the AUT for two orthogonal polarizations. The NSI2000 software from NSI-MI Technologies is used to control and synchronize the Rohde & Schwarz ZVA 40 network analyzer, the probe stand, and the AUT positioning system. A Spacek Labs SLKKA-30-6 amplifier (AMP1) is added to increase the signal-to-noise ratio (SNR) of the measurement.
Initially, the relative radiation patterns of each of the individual antenna elements are measured sequentially, while the unused ports are equipped with 50.0-Ω terminations. In order to obtain absolute gain and phase characteristics, the data are calibrated with one of the elements of the AUT itself. For that purpose, the amplitude and phase in the main-beam direction of that single element are deduced with a three-antenna method [33] beforehand.
The measurement data consist of 451 equally spaced samples from 22.0 to 40.0 GHz, resulting in a sampling distance of 40.0 MHz. For each frequency, each element's far field is sampled for two orthogonal polarization components on a regular angular grid with 61 uniform samples in co-elevation ranging from 0° to 180° and 121 uniform samples in azimuth ranging from −180° to +180°, resulting in a regular 2-D sampling with 3° steps in both angular directions. Hence, in (9), we have u ∈ C^{61×121×451×2×12} at our disposal.
In Fig. 5, we show the absolute value of the transfer function of a single port in its mainbeam direction together with the resulting interpolation by means of the methodology presented in Section II-C1 in frequency domain. In order to choose the quantities Q and ω used in (11), we set Q = 9 to account for the electric size of the antenna and possible multiple reflections within the geometry of the antenna. The parameter ω = 2 is chosen, such that the resulting ringing at the border of the measured frequency domain was deemed negligible.
First, we notice in Fig. 5 that, in frequency domain, the transfer function is denoised substantially in the sense that it is well approximated by a slowly varying function. At the same time, we remove the fading process stemming from weak but noticeable multipath components in the chamber [34]. This directly translates to an effective separation of the antenna from the other components present in the measurements in time domain, e.g., measurement noise and fading, due to the elimination of unwanted reflections in the chamber. This separation is possible, since the antenna has a very limited electrical size; hence, it can only occupy a limited window in delay domain.
The described effect is similar to the spatial aperture reduction proposed in [30], where the antenna is separated in the angular domain from possibly existing clutter by removing the high-frequency components in the angular domain. These high-frequency components cannot originate from the physically limited aperture of the antenna, so they can only stem from reflections on objects in the measurement chamber and, hence, should be removed from the calibration data. Our proposed approach allows the combination of both filtering approaches.
B. Application to mmWave Channel Measurements 1) Estimation With Wideband Antenna Models: In this section, we would like to study the effects and benefits of making use of the more complex, but also more accurate antenna model that takes the frequency dependence, i.e., dispersion, of the involved antennas into account.
For the analysis, we use synthetic data generated from ray-tracing results obtained for a single-input-multiple-output (SIMO) indoor scenario, where the synthetic measurement was simulated according to an accurate system model provided by the National Institute of Standards and Technology (NIST). More specifically, the model incorporates the frequency- and polarization-dependent amplitude and phase information of the open-ended waveguide distributed over a synthetic aperture. The synthetic aperture consists of three uniform rectangular arrays (URAs) located at the same position, but rotated by 120° with respect to each other in azimuth to cover the whole azimuthal range and avoid forward-backward ambiguity during estimation. We use three 35 × 35 uniform rectangular synthetic apertures with λ/2 spacing of the individual elements at 40.0 GHz, acquired by the so-called synthetic aperture measurements of uncertainty in angle of incidence (SAMURAI) system [35]. The underlying ray-tracing simulations were carried out for a dense industrial environment with >80 multipath components, some of which are closely clustered. In addition, the ground-truth paths were contaminated with >300 weak components around the stronger multipath components to approximate diffuse scattering.
For a comparison, we carry out estimations for different bandwidths and for two different models. We compare the results of the models s̃ and s, where for s, we employed the proposed wideband antenna model of Section II-C1 at the RX. In both cases, we estimated a fixed number of ten paths in order to make the estimates comparable. Note that using the model s̃ essentially results in the conventional RIMAX estimator [36]. In both cases, we did not model any stochastic/dense multipath components by virtue of R and instead assumed the covariance of the data to be a scaled identity matrix.
As a metric, we use the so-called relative residual power p, i.e., the ratio of residual power to total observed power, which accounts for the power of the data not captured by the used model. Table I suggests that the wideband antenna model is, in fact, necessary to obtain consistent estimation results over a larger bandwidth. The estimation performance in terms of p steadily degrades for the model based on the narrowband assumption. In contrast, the subtracted power when using s is stable across all bandwidths.
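Assuming the natural definition of this metric, p = ‖y − s‖² / ‖y‖², it can be evaluated in one line (names below are ours):

```python
import numpy as np

def relative_residual_power(y, s_model):
    """p = ||y - s||^2 / ||y||^2: the fraction of observed power that
    the fitted model fails to capture (smaller is better)."""
    return np.linalg.norm(y - s_model) ** 2 / np.linalg.norm(y) ** 2

y_demo = np.array([1 + 1j, -2.0 + 0j, 0.5j])
p_perfect = relative_residual_power(y_demo, y_demo)               # 0.0
p_empty = relative_residual_power(y_demo, np.zeros_like(y_demo))  # 1.0
```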
Although, in the considered bands, the open-ended waveguide is fairly frequency flat in amplitude and phase, it is necessary to account for the relative spatial shifts of the measurement positions in the synthetic aperture by the naturally frequency-dependent phase shifts. Due to the large extent of the aperture, the phase shifts of the outermost elements of the rectangle at the center frequency differ substantially from those at the edges of the considered bands. The resulting model mismatch already becomes significant at 1.00 GHz of bandwidth.
2) Estimating Reflecting Objects' Transfer Functions: In the following analysis of the proposed PyMAX estimator, we use measurements from an indoor scenario ranging from 27.0 to 33.0 GHz, i.e., 6.00 GHz of bandwidth. Here, we use a single 35 × 35 uniform rectangular synthetic aperture with λ/2 element spacing at 40.0 GHz, acquired by the so-called SAMURAI system [35]; see Fig. 6 for the measurement setup. Note that, based on the rule of thumb that the phase error due to the curvature of spherical waves should not exceed 8°, we have concluded that the scatterers are located in a region where the far-field assumption is violated. Hence, this is an inherent model mismatch of the setup, which biases the estimators' results. During the estimation process, we use the wideband antenna model proposed in Section II-C1 with an overlap ω = 2 and Q = 8 sinc functions to interpolate the antenna measurements of the open-ended waveguide, taken over the whole bandwidth in 500-MHz steps, to the measurement frequency grid with a spacing of 10.0 MHz. We use the measurements of the antenna on a 2°-spaced grid both in co-elevation and azimuth. In addition, to account for late responses stemming from the room containing the measurement setup, we employed a newly developed multimodal estimation routine for the DMC [29].
For the results depicted in Fig. 7, we use two different models for the objects' transfer functions. First, we run the estimator modeling the reflecting objects as point sources, i.e., as frequency flat, as outlined in Section II-B. Hence, only the frequency dispersion of the antenna is taken into account during the estimation process. Second, we use a model that interpolates the transfer functions, as outlined in Section II-C2, where we choose σ = S = 1. In other words, we assume that the reflecting objects are only mildly frequency-dependent or, equivalently, that their IRs are concentrated within a small interval in the time domain.
However, as we see in Fig. 7, this effect significantly influences the estimation results. If we impose point-like reflecting objects, we observe that one of them is modeled by two estimated paths, since the model is not flexible enough to account for the more complex behavior at 6.00-GHz bandwidth. Ultimately, this results in one reflection not being detected by the estimator. In contrast, when using the model that interpolates the reflecting objects' transfer functions, the correct parameters are recovered from the measurements, and all targets are detected. The visible biases in both cases can stem from calibration insufficiencies and measurement noise.

V. CONCLUSION
This article presented more accurate propagation and device data models for model-based parameter estimation of radio channel measurements, discussed their implications on the algebraic structure of these models as well as the algorithmic aspects arising in this more challenging setting, and provided numerical justification for these methods based on simulation and measurement data.

A. Summary
We enhance the RIMAX parameter estimation framework primarily by including proper frequency responses. The advantage is shown most clearly at high bandwidth in the mmWave and sub-THz range. First, we extend the EADF antenna data model proposed in [3] by a frequency dimension. We have shown that the deterioration caused by neglecting the frequency response grows not only with bandwidth and with the distance between neighboring elements (if it exceeds λ/2, which is often the case with waveguide antennas in a circular architecture). Rather, it also depends on the size of the array (the number of antennas) and, hence, on the distance between the outermost elements of the array.
Moreover, with the FIR filter applied in the time domain, we have a well-defined (and usually small) number of coefficients describing the frequency response of the antenna. This allows efficient frequency-domain interpolation. Together with the spatial FIR filter model described in [3], we end up with a compact (minimum-size) 3-D EADF antenna model containing one set of temporal coefficients and two sets of spatial coefficients, which can be interpolated and differentiated in any direction and dimension. Eventually, we can apply this finite representation to gate the data in all three aperture dimensions if we have to remove parasitic multipath during the antenna calibration measurement. Second, we made another extension to the propagation data model. Normally, we assume point-like objects with a uniform (constant) frequency response. However, if the interacting object extends over several delay bins, we propose a nonparametric coherent scattering model that assigns a short delay spread and a frequency response to the propagation path associated with the respective object.
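A minimal sketch of this temporal FIR description: a handful of taps is fitted to frequency-domain calibration samples and can then be evaluated, i.e., interpolated, on any frequency grid. The tap spacing and the 3-tap test response below are illustrative assumptions:

```python
import numpy as np

def fir_frequency_response(taps, dt, f):
    """Evaluate the frequency response of a short FIR (tap spacing dt) at f."""
    n = np.arange(len(taps))
    return np.exp(-2j * np.pi * f[:, None] * n[None, :] * dt) @ taps

def fit_fir(h, f, n_taps, dt):
    """Least-squares fit of n_taps FIR coefficients to samples h on grid f."""
    n = np.arange(n_taps)
    E = np.exp(-2j * np.pi * f[:, None] * n[None, :] * dt)
    taps, *_ = np.linalg.lstsq(E, h, rcond=None)
    return taps

# a response that is exactly a short FIR is recovered from its frequency
# samples and can then be evaluated on any denser frequency grid
f = np.arange(27.0e9, 33.0e9, 100e6)
dt = 1.0 / 6.0e9                          # tap spacing = 1/bandwidth (assumed)
true_taps = np.array([1.0, 0.3 - 0.1j, 0.05j])
h = fir_frequency_response(true_taps, dt, f)
est = fit_fir(h, f, n_taps=3, dt=dt)      # recovers true_taps
```

Because only a few taps are needed, differentiation with respect to frequency and time gating (truncating the taps) both act on this small, finite coefficient set.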
For bandwidths of 6.00 GHz at a 30.0-GHz center frequency, the numerical experiments carried out with measurement data suggest that it can be advantageous to estimate the transfer functions or delay responses of the individual interacting objects. This may allow for object identification and also reduces the probability of estimating artifacts, such as path splitting or "ghost sources" [3].

B. Outlook
However, this FIR model for the interacting objects' transfer functions introduces a nontrivial model order selection problem that ideally has to be solved on a per-object basis. Also, depending on the intent behind the parameter estimation, different outcomes might be desirable. This motivates a more thorough study of these new possibilities. Related to the FIR approach, one should aim at extending the model to also account for IIR responses. This might bear the advantage that resonant scatterers can be described more efficiently.
Moreover, the observed phase dispersion caused by a larger antenna aperture has to be studied more carefully for larger relative bandwidths. The ramifications in terms of modeling, simulation, and estimation may be plentiful and interesting.
One may ask whether it is necessary to model all objects of the environment in this way as extended objects, especially in a multipath-rich environment with many superimposing contributions. This is not the case. It appears to be more important in environments with a few dominant objects, e.g., interacting cars in the close vicinity of a direct car2car scenario. Another application is objects of outstanding importance. This can apply if radar sensing capabilities are integrated with communications [integrated communications and sensing (ICAS)]. Then, recognition and identification of dominant objects in an otherwise uninteresting environment (clutter) are the goal.
For relative and absolute bandwidths reaching the levels of ultrawideband (UWB) systems, the Doppler effect, which corresponds to a pure frequency shift only under the narrowband assumption, also becomes a frequency-dependent quantity, and the effect of moving targets in these scenarios should be accounted for as well. The resulting effects and the necessary postprocessing steps are out of the scope of this publication but are interesting and complex topics for future work.
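A small sketch of this frequency dependence for a path whose length changes at a hypothetical rate v (first-order, non-relativistic approximation):

```python
import numpy as np

c = 299_792_458.0          # speed of light [m/s]
v = 30.0                   # rate of change of the path length [m/s] (assumed)

def doppler_shift(f, v):
    """First-order Doppler shift of a path whose length changes at rate v."""
    return v / c * f       # proportional to the carrier frequency

# across a 6-GHz band at 30-GHz center, the Doppler shift itself varies:
f = np.array([27.0e9, 30.0e9, 33.0e9])
fd = doppler_shift(f, v)
spread = fd[-1] - fd[0]    # variation the narrowband model ignores [Hz]
```

The narrowband model applies the single shift fd at the center frequency to all subcarriers, which is only admissible while the spread across the band stays below the Doppler resolution of the measurement.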
One caveat of the proposed PyMAX estimator is the search for a new specular component using the simplified narrowband model, which uses the Kronecker factorization. Due to this model mismatch during the initial search, which we have demonstrated to become increasingly severe, the initial guesses will eventually be biased so far away from the actual paths that ghost sources become unavoidable; hence, a search routine without the narrowband assumption is indispensable. Possibly, recent advances in deep-learning-based parameter estimation can help to approximate a solution to this inherently nonconvex problem.

APPENDIX

Considering (2), we realize that the parameters γ play a different role than η. This is due to the fact that, given η, it is possible in both models to determine the BLUE of γ. If both the data y and the parameter γ were vectorized, the BLUE mapping y to ŷ would be realized by the so-called Moore-Penrose inverse of the matrix mat(a(η_p)), with a(η_p) as in (2).
The requirement here is that the noise in y is white, i.e., that y follows a standard Gaussian distribution. If this is not the case, one has to first estimate the covariance of y and then apply suitable prewhitening before calculating the BLUE of γ.
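A minimal sketch of this prewhitening step via the Cholesky factor of an assumed noise covariance (toy dimensions, real-valued for brevity):

```python
import numpy as np

def prewhiten(y, A, C):
    """Whiten data y and model matrix A with noise covariance C.

    After whitening, ordinary least squares on (A_w, y_w) is the BLUE.
    """
    L = np.linalg.cholesky(C)            # C = L L^H
    y_w = np.linalg.solve(L, y)          # L^{-1} y has identity covariance
    A_w = np.linalg.solve(L, A)
    return y_w, A_w

# toy example: correlated noise, one known regressor
C = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0], [1.0]])
y = np.array([3.0, 3.1])
y_w, A_w = prewhiten(y, A, C)
gamma, *_ = np.linalg.lstsq(A_w, y_w, rcond=None)   # BLUE of gamma
```

This is equivalent to generalized least squares, i.e., solving the normal equations with C^{-1} as the weighting, but numerically better behaved for ill-conditioned covariances.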
However, since we do not vectorize either of our data models (5) or (16), we need to transfer the Moore-Penrose inverse to this more involved setting.
Assume we have already determined estimates of η ∈ R^{q_p}; then, the solutions of min_γ̃ λ̃(γ̃, η) and min_γ λ(γ, η) (36) can be determined by evaluating linear functions g̃ : C^{n_mimo} → C^{ℓ̃_p} and g : C^{n_mimo} → C^{ℓ_p} with y ↦ g̃(y) and y ↦ g(y). This means that we can find tensors M̃ ∈ C^{ℓ̃_p × n_mimo} and M ∈ C^{ℓ_p × n_mimo} such that g̃(y) = M̃ ·̃ y and g(y) = M · y, where the tensor-vector products ·̃ and · have to be determined based on the specific version of (2). Hence, we skip the derivation of analytic expressions for g̃ and g, noting that they can be derived analogously to the Moore-Penrose matrices.
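The role of these closed-form weight mappings can be illustrated in the vectorized matrix case, where they reduce to the Moore-Penrose pseudoinverse; the basis matrix below is a hypothetical stand-in for mat(a(η_p)):

```python
import numpy as np

def blue_weights(A, y):
    """Closed-form path weights gamma for fixed nonlinear parameters eta.

    A : basis matrix whose columns are the model responses for each path
    y : (whitened) observation vector
    Returns the Moore-Penrose solution gamma = pinv(A) @ y.
    """
    return np.linalg.pinv(A) @ y

# two hypothetical path responses sampled over 4 frequency bins
A = np.array([[1, 1], [1, 1j], [1, -1], [1, -1j]], dtype=complex)
gamma_true = np.array([2.0, 0.5 - 0.5j])
y = A @ gamma_true
gamma = blue_weights(A, y)        # recovers gamma_true
```

In the non-vectorized tensor models, the same linear map is realized by the tensors acting through the appropriate tensor-vector products instead of a plain matrix product.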
In Section II-C1, we reduce the description of the angle-dependent transfer function of the measurement device, i.e., the antenna's polarimetric beam pattern, to a superposition of linearly weighted sinc functions. The weights of these sinc functions are still angle-dependent. In order to interpolate the coefficients over angle via the original single-frequency EADF, they need to satisfy an angular bandlimitation.
The following result essentially states that if, for all measured frequencies during the calibration, the respective single-frequency beam patterns are spatially bandlimited, then the resulting sinc coefficients are bandlimited as well.
Proof: For each angle pair (ϑ, ϕ), the solution b is a linear superposition of the values of a, since the least-squares solution is a linear function of the input data.
Hence, the uniform bandlimit that holds for a also holds for b.
Consequently, it is valid to use the EADF or any other angular interpolation method for the resulting sinc coefficients.
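The key step of the proof, that least-squares coefficients are linear in the input samples, can be checked numerically; the random matrix F below is a stand-in for the fixed sinc basis:

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((10, 4))         # fixed basis matrix (stand-in)

def ls_coeffs(a):
    """Least-squares coefficients for one angle's frequency samples a."""
    b, *_ = np.linalg.lstsq(F, a, rcond=None)
    return b

# linearity: the coefficients of a superposition equal the superposition of
# the coefficients, so a uniform bandlimit of the inputs over angle carries
# over to the coefficients
a1, a2 = rng.standard_normal(10), rng.standard_normal(10)
lhs = ls_coeffs(0.3 * a1 + 2.0 * a2)
rhs = 0.3 * ls_coeffs(a1) + 2.0 * ls_coeffs(a2)
```

Since b = pinv(F) a for a fixed basis F, any angular Fourier series expansion of a maps term by term onto one of b, which is exactly the claimed preservation of the bandlimit.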