Practical Hybrid Beamforming for Millimeter Wave Massive MIMO Full Duplex with Limited Dynamic Range

Full Duplex (FD) radio has emerged as a promising solution to increase the data rates by up to a factor of two via simultaneous transmission and reception in the same frequency band. This paper studies a novel hybrid beamforming (HYBF) design to maximize the weighted sum-rate (WSR) in a single-cell millimeter wave (mmWave) massive multiple-input-multiple-output (mMIMO) FD system. Motivated by practical considerations, we assume that the multi-antenna users and hybrid FD base station (BS) suffer from the limited dynamic range (LDR) noise due to non-ideal hardware and an impairment aware HYBF approach is adopted by integrating the traditional LDR noise model in the mmWave band. In contrast to the conventional HYBF schemes, our design also considers the joint sum-power and the practical per-antenna power constraints. A novel interference, self-interference (SI) and LDR noise aware optimal power allocation scheme for the uplink (UL) users and FD BS is also presented to satisfy the joint constraints. The maximum achievable gain of a multi-user mmWave FD system over a fully digital half duplex (HD) system with different LDR noise levels and numbers of the radio-frequency (RF) chains is investigated. Simulation results show that our design outperforms the HD system with only a few RF chains at any LDR noise level. The advantage of having amplitude control at the analog stage is also examined, and additional gain for the mmWave FD system becomes evident when the number of RF chains at the hybrid FD BS is small.


I. INTRODUCTION
THE revolution in wireless communications has led to an exponential increase in the data rate requirements and number of users. The millimeter wave (mmWave) frequency band, 30-300 GHz, can accommodate the ever-increasing data demands and is a vital resource for future wireless communications [1]. It offers much wider bandwidths than the traditional cellular networks, and the available spectrum at such higher frequencies is 200 times greater [2]. Full Duplex (FD) communication in mmWave has the potential to further double the spectral efficiency by offering simultaneous transmission and reception in the same frequency band. Moreover, it can be beneficial for efficient management of the vast mmWave spectrum, reducing end-to-end delays/latency, enabling advanced joint communication and sensing, and solving the hidden node problem [3]-[6].
Self-interference (SI), which can be 90-110 dB higher than the received signal [7], [8], is a key challenge in achieving ideal FD operation. Given the tremendous amount of SI, signal reception is impossible without a proper SI cancellation scheme. Beamforming is a powerful tool for FD to mitigate the SI while serving multiple users and can lead to a significant performance gain compared to a half duplex (HD) system [9]-[18]. However, its gain in practical communication systems is restricted by the limited dynamic range (LDR) of the radio-frequency (RF) chains [12]. The signal may suffer from LDR noise due to the distortions introduced by non-ideal power amplifiers (PAs), analog-to-digital converters (ADCs), digital-to-analog converters, mixers and low-noise PAs. These impairments dictate the residual SI power which cannot be cancelled and therefore establish the achievable gain for FD [12]. This adverse effect calls for impairment aware beamforming designs and for investigating their performance in terms of the LDR noise levels, such that correct conclusions on the achievable gain of FD can be drawn. Such an approach for the fully digital FD systems can be adopted with the well-established LDR noise model available in [10]-[18]. In general, impairment aware beamforming is more robust to distortions and can significantly outperform the naive schemes [19], [20]; see, e.g., [20, Figure 2].
Chandan Kumar Sheemar and Dirk Slock are with the Communication Systems Department at EURECOM, Sophia Antipolis, 06410, France (emails: sheemar@eurecom.fr, slock@eurecom.fr); Christo Kurisummoottil Thomas is with Qualcomm Finland RFFE Oy, Keilaranta 8, 02150 Espoo (e-mail: ckurisum@qti.qualcomm.com).
The deployment of multi-user mmWave FD systems requires the FD base stations (BSs) to be equipped with a massive number of antennas to overcome the propagation challenges. Owing to the hardware cost, they will have to rely on a hybrid architecture consisting of only a few RF chains. Therefore, efficient hybrid beamforming (HYBF) schemes are required for such transceivers to manage the SI and interference jointly by performing large-dimensional phasor processing in the analog domain and lower-dimensional digital processing.

A. State-of-the-art and Motivation
In [21]-[27], novel HYBF designs for a point-to-point mmWave massive MIMO (mMIMO) FD system are studied. HYBF schemes for mMIMO FD relays and for integrated access and backhaul are presented in [28]-[30] and [31], respectively. HYBF designs with single-antenna uplink (UL) and downlink (DL) users for a single-cell and a multi-cell mmWave FD system are proposed in [32] and [33], respectively. In [34], HYBF for mmWave mMIMO FD with only one UL and one DL multi-antenna user, under the receive LDR, is proposed. In [35], HYBF for two fully connected mMIMO FD nodes that approaches the SI-free sum-spectral efficiency is proposed. In [36], HYBF for a mmWave FD system equipped with an analog SI cancellation stage is presented. In [37], HYBF that generalizes the point-to-point mmWave mMIMO FD communication to the case of K-pair links is presented. Frequency-selective HYBF for a wide-band mmWave FD system is studied in [38].
The literature on multi-antenna multi-user mmWave FD systems is limited to the case of one UL and one DL user [34]-[36], [38]. In [34], the receive side LDR of the FD BS is also considered, which is dominated by the quantization noise of the ADCs. However, LDR noise from the transmit side is ignored, which also affects the performance of FD systems significantly [39]. The effect of cross-interference generated from the UL user towards the DL user is also not considered in [34], which can have a major impact on the achievable performance. Cross-interference generated from the neighbouring cells is well investigated in the dynamic time-division-duplexing networks [40]-[44], and it is more harmful to the multi-user FD systems as it occurs in the same cell. For example, consider the case of a small cell, in which BSs and users are expected to operate with a similar amount of transmit power [44]. Suppose that one FD BS simultaneously serves one UL and one DL user and that both users are located close to each other and sufficiently far from the BS. In such a case, cross-interference can become as severe as the SI and can completely drown the useful signal intended for the DL user if not considered in the beamforming design. In a multi-user scenario with multiple UL users located near the DL users, each DL user suffers from cross-interference, which is summed over all the UL users' transmit power, with each UL user transmitting with a similar amount of power as the BS. In such a case, cross-interference can become even more severe than the SI if not considered in the design.

B. Main Contributions
We present a novel HYBF design to maximize the weighted sum-rate (WSR) in a single-cell mmWave mMIMO FD system, i.e., for multiple multi-antenna UL and DL users. The users are assumed to have a limited number of antennas and digital processing capability. The FD BS is assumed to have a massive number of antennas and hybrid processing capability. Our design is based on alternating optimization and relies on the mathematical tools offered by minorization-maximization [45]. The users and BS are assumed to suffer from the LDR noise due to non-ideal hardware, modelled with the traditional LDR model [12] and by extending it to the case of a hybrid transceiver, respectively. Our work represents the first-ever impairment aware HYBF approach for mmWave FD and its analysis as a function of the LDR noise levels. The extension of the LDR noise model presented herein is applicable to any mmWave FD scenario.
In contrast to the conventional HYBF designs for mmWave FD, in this work, the beamformers are designed under the joint sum-power and the practical per-antenna power constraints. The sum-power constraint at each terminal is imposed by the regulations, which limit its total transmit power. In practice, each transmit antenna is equipped with its own PA^1 [47], and the per-antenna power constraints arise due to power consumption limits imposed on the physical PAs [47]-[51]. We also present a novel SI, interference, cross-interference and LDR noise aware optimal power allocation scheme to meet the joint constraints.
^1 The mMIMO systems are also expected to be deployed with one PA per antenna to enable the deployment of very low-cost PAs [46].
Compared to the digital part, optimization of the analog stage is more challenging as it must obey the unit-modulus constraint. Recently, new transceivers have started to emerge which, with the aid of amplitude modulators (AMs), also allow amplitude control for the analog stage [34], [52], [53]. Such transceivers alleviate the unit-modulus constraint but require additional hardware. Hence, we study both the unit-modulus and AMs cases and investigate when the amplitude control for mmWave FD could be advantageous. In practice, as the analog beamformer and analog combiner can assume only finite values, a quantization constraint is also imposed on them during the optimization process. In our problem formulation, the WSR does not depend on the digital combiners, which are omitted in the design. They must be chosen as the minimum mean-squared-error (MMSE) combiners after the convergence of the proposed algorithm. By omitting the digital combiners, whose number equals the sum of the numbers of UL and DL users, the HYBF design simplifies, and the per-iteration computational complexity reduces significantly.
Simulation results show that our design outperforms a fully digital HD system and can deal with the SI, interference and cross-interference with only a few RF chains. Results are reported with different LDR noise levels, and significant performance gain is observed at any level.
In summary, the contributions of our work are:
• Extension of the LDR noise model for the mmWave band.
• Introduction of the WSR maximization problem formulation for HYBF in a single-cell mmWave mMIMO FD system affected by the LDR noise.
• A novel SI, interference, cross-interference, LDR noise and the practical per-antenna power constraints aware HYBF design.
• Investigation of the achievable WSR in a multi-user mmWave FD system as a function of the LDR noise.
• Optimal interference, SI, LDR noise and the per-antenna power constraints aware power allocation scheme for the hybrid FD BS and UL users.
Paper Organization: The rest of the paper is organized as follows. Section II presents the system model, problem formulation and extends the LDR noise model. Sections III and IV present the minorization-maximization method and a novel HYBF design, respectively. Finally, Sections V and VI present the simulation results and conclusions, respectively.
Mathematical Notations: Boldface lower and upper case characters denote vectors and matrices, respectively. $\mathbb{E}\{\cdot\}$, $\mathrm{Tr}\{\cdot\}$, $(\cdot)^H$, $(\cdot)^T$, $\otimes$, $I$, $D_d$ and $i$ denote expectation, trace, conjugate transpose, transpose, Kronecker product, identity matrix, $d$ dominant vectors selection matrix and the imaginary unit, respectively. $\mathrm{vec}(X)$ stacks the columns of $X$ into a vector $x$ and $\mathrm{unvec}(x)$ reshapes $x$ into $X$. $\angle X$ and $\angle x$ return the unit-modulus phasors of $X$ and the unit-modulus phasor of $x$, respectively. $\mathrm{Cov}(\cdot)$ and $\mathrm{diag}(\cdot)$ denote the covariance and diagonal matrices, respectively. $\mathrm{SVD}(X)$ returns the singular value decomposition (SVD) of $X$. The element of $X$ at the $m$-th row and $n$-th column is denoted as $X(m, n)$. The vector of zeros of size $M$ is denoted as $0_{M \times 1}$. Operators $|X|$ and $|x|$ return a matrix of moduli of $X$ and the modulus of scalar $x$, respectively.

II. SYSTEM MODEL
We consider a single-cell mmWave FD system consisting of one hybrid FD BS serving $J$ DL and $K$ UL fully digital multi-antenna users, as shown in Fig. 1. We assume perfect channel state information (CSI)^2. The FD BS is assumed to have $M_t$ transmit and $N_r$ receive RF chains, and $M_0$ transmit and $N_0$ receive antennas. Let $\mathcal{U} = \{1, \ldots, K\}$ and $\mathcal{D} = \{1, \ldots, J\}$ denote the sets containing the indices of the $K$ UL and $J$ DL users, respectively. Let $M_k$ and $N_j$ denote the number of transmit and receive antennas for the $k$-th UL and $j$-th DL user, respectively. We consider a multi-stream approach, and the numbers of data streams for the $k$-th UL and $j$-th DL user are denoted as $u_k$ and $v_j$, respectively. Let $U_k \in \mathbb{C}^{M_k \times u_k}$ and $V_j \in \mathbb{C}^{M_t \times v_j}$ denote the precoders for the white unitary variance data streams $s_k \in \mathbb{C}^{u_k \times 1}$ and $s_j \in \mathbb{C}^{v_j \times 1}$, respectively. Let $G_{RF} \in \mathbb{C}^{M_0 \times M_t}$ and $F_{RF} \in \mathbb{C}^{N_0 \times N_r}$ denote the fully connected analog beamformer and combiner at the FD BS, respectively. Let $\mathcal{P} = \{1, e^{i2\pi/n_{ps}}, \ldots, e^{i2\pi(n_{ps}-1)/n_{ps}}\}$ denote the set of $n_{ps}$ possible discrete values that the phasors at the analog stage can assume on the unit circle.
For HYBF with the unit-modulus constraint, we define the quantizer function $Q_P(\cdot)$ to quantize the unit-modulus phasors of the analog beamformer $G_{RF}$ and combiner $F_{RF}$ such that $Q_P(\angle G_{RF}(m, n)) \in \mathcal{P}$ and $Q_P(\angle F_{RF}(m, n)) \in \mathcal{P}$, $\forall m, n$. For HYBF with amplitude control, the phase part is still quantized with $Q_P(\cdot)$ and belongs to $\mathcal{P}$. Let $\mathcal{A} = \{a_0, \ldots, a_{A-1}\}$ denote the set of $A$ possible values that the amplitudes can assume. Let $Q_A(\cdot)$ denote the quantizer function to quantize the amplitudes of $G_{RF}$ and $F_{RF}$ such that $Q_A(|G_{RF}(m, n)|) \in \mathcal{A}$ and $Q_A(|F_{RF}(m, n)|) \in \mathcal{A}$, $\forall m, n$. A complex number $G_{RF}(m, n)$ with amplitude in $\mathcal{A}$ and phase part in $\mathcal{P}$ can be written as $G_{RF}(m, n) = Q_A(|G_{RF}(m, n)|)\,Q_P(\angle G_{RF}(m, n))$. The thermal noise vectors for the FD BS and $j$-th DL user are denoted as $n_0 \sim \mathcal{CN}(0, \sigma_0^2 I_{N_0})$ and $n_j \sim \mathcal{CN}(0, \sigma_j^2 I_{N_j})$, respectively. Let $c_k$ and $e_j$ denote the LDR noise vectors for the $k$-th UL and $j$-th DL user, respectively, which can be modelled as [12]
$$c_k \sim \mathcal{CN}\big(0_{M_k \times 1},\; k_k\,\mathrm{diag}(U_k U_k^H)\big), \tag{1}$$
$$e_j \sim \mathcal{CN}\big(0_{N_j \times 1},\; \beta_j\,\mathrm{diag}(\Phi_j)\big), \tag{2}$$
where $k_k \ll 1$, $\beta_j \ll 1$, $\Phi_j = \mathrm{Cov}(r_j)$ and $r_j$ denotes the undistorted signal received by the $j$-th DL user. Let $c_0$ and $e_0$ denote the LDR noise vectors in transmission and reception for the FD BS, respectively. We model them as
$$c_0 \sim \mathcal{CN}\Big(0_{M_0 \times 1},\; k_0\,\mathrm{diag}\Big(\sum_{n \in \mathcal{D}} G_{RF} V_n V_n^H G_{RF}^H\Big)\Big), \tag{3}$$
$$e_0 \sim \mathcal{CN}\big(0_{N_r \times 1},\; \beta_0\,\mathrm{diag}(\Phi_0)\big), \tag{4}$$
where $k_0 \ll 1$, $\beta_0 \ll 1$, $\Phi_0 = \mathrm{Cov}(r_0)$ and $r_0$ denotes the undistorted signal received by the FD BS after the analog combiner $F_{RF}$. Note that (3) extends the transmit LDR noise model from [12] to the case of a hybrid transmitter. For the hybrid receiver at the mmWave FD BS, the ADCs, which are the most dominant sources of receive LDR noise, are placed after the analog combiner $F_{RF}$. Consequently, $e_0$ in (4) considers the undistorted signal received after the analog combiner. We remark that the extension presented in (3)-(4) is slightly simplified. In practice, as some circuitry might be shared among multiple antennas, it can lead to some correlation.
^2 The CSI of the mmWave FD systems can be acquired similarly as in [54] for the mmWave HD system and it is part of the ongoing research [55].
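The two quantizers described above can be sketched in a few lines. This is a minimal illustration, assuming nearest-neighbour quantization on the unit circle and over the amplitude set; the function names `quantize_phase`, `quantize_amplitude` and `quantize_entry` are hypothetical and not taken from the paper.

```python
import cmath
import math
from typing import Sequence


def quantize_phase(x: complex, n_ps: int) -> complex:
    """Q_P: map x to the nearest phasor e^{i 2 pi q / n_ps}, q = 0..n_ps-1."""
    step = 2 * math.pi / n_ps
    q = round(cmath.phase(x) / step) % n_ps  # nearest index, wrapped onto the circle
    return cmath.exp(1j * step * q)


def quantize_amplitude(a: float, levels: Sequence[float]) -> float:
    """Q_A: map a non-negative amplitude to the nearest level of the set A."""
    return min(levels, key=lambda level: abs(level - a))


def quantize_entry(x: complex, n_ps: int, levels: Sequence[float]) -> complex:
    """Amplitude-controlled entry: Q_A(|x|) * Q_P(angle of x)."""
    return quantize_amplitude(abs(x), levels) * quantize_phase(x, n_ps)
```

With the unit-modulus constraint only `quantize_phase` is applied to each entry of the analog beamformer or combiner; with AMs, `quantize_entry` is applied instead.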
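Let $y$ and $y_j$ denote the signals received by the FD BS and the $j$-th DL user, respectively, which can be written as
$$y = F_{RF}^H \sum_{k \in \mathcal{U}} H_k (U_k s_k + c_k) + F_{RF}^H n_0 + F_{RF}^H H_0 \Big(\sum_{j \in \mathcal{D}} G_{RF} V_j s_j + c_0\Big) + e_0, \tag{5}$$
$$y_j = H_j \Big(\sum_{n \in \mathcal{D}} G_{RF} V_n s_n + c_0\Big) + e_j + n_j + \sum_{k \in \mathcal{U}} H_{j,k} (U_k s_k + c_k). \tag{6}$$
The matrices $H_k \in \mathbb{C}^{N_0 \times M_k}$ and $H_j \in \mathbb{C}^{N_j \times M_0}$ denote the channel response from the $k$-th UL user to the BS and from the BS to the $j$-th DL user, respectively. The matrices $H_0 \in \mathbb{C}^{N_0 \times M_0}$ and $H_{j,k} \in \mathbb{C}^{N_j \times M_k}$ denote the SI channel response for the FD BS and the cross-interference channel response between the $k$-th UL and $j$-th DL users, respectively. At mmWave, the channel response $H_k$ can be modelled as [23]
$$H_k = \sqrt{\frac{M_k N_0}{N_c N_p}} \sum_{n_c=1}^{N_c} \sum_{n_p=1}^{N_p} \alpha_k^{(n_p,n_c)}\, a_r(\phi_k^{n_p,n_c})\, a_t^T(\theta_k^{n_p,n_c}), \tag{7}$$
where $N_c$ and $N_p$ denote the number of clusters and the number of rays per cluster, respectively, $\alpha_k^{(n_p,n_c)} \sim \mathcal{CN}(0, 1)$ is the complex path gain, and $a_r(\phi_k^{n_p,n_c})$ and $a_t^T(\theta_k^{n_p,n_c})$ denote the receive and transmit antenna array response with angle of arrival (AoA) $\phi_k^{n_p,n_c}$ and angle of departure (AoD) $\theta_k^{n_p,n_c}$, respectively. The channel matrices $H_j$ and $H_{j,k}$ can be modelled similarly as in (7). The SI channel can be modelled as [23]
$$H_0 = \sqrt{\frac{\kappa}{\kappa+1}}\, H_{LoS} + \sqrt{\frac{1}{\kappa+1}}\, H_{ref}, \tag{8}$$
where $\kappa$ denotes the Rician factor, and the matrices $H_{LoS}$ and $H_{ref}$ denote the line-of-sight (LoS) and reflected contributions, respectively. The channel matrix $H_{ref}$ can be modelled as (7) and the element of $H_{LoS}$ at the $m$-th row and $n$-th column can be modelled as [23]
$$H_{LoS}(m, n) = \frac{\rho}{r_{m,n}}\, e^{-i 2\pi \frac{r_{m,n}}{\lambda}}, \tag{9}$$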
where $\rho$ denotes the power normalization constant to assure $\mathbb{E}(\|H_{LoS}\|_F^2) = M_0 N_0$ and $\lambda$ denotes the wavelength. The scalar $r_{m,n}$ denotes the distance between the $m$-th receive and $n$-th transmit antenna, which depends on the transmit and receive array geometry (9) [23]. The aforementioned notations are summarized in Table I.
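As a small numerical illustration of the LoS entry in (9), the sketch below evaluates one element of $H_{LoS}$ for a given antenna distance and combines it with a reflected term. The Rician weighting used in `rician_mix` is the standard form and is an assumption here; the 28 GHz carrier, the value $\rho = 1$ and the function names are hypothetical, and the array geometry of [23] that produces $r_{m,n}$ is not reproduced.

```python
import cmath
import math


def los_entry(r_mn: float, wavelength: float, rho: float = 1.0) -> complex:
    """H_LoS(m, n) = (rho / r_mn) * exp(-i 2 pi r_mn / wavelength)."""
    return (rho / r_mn) * cmath.exp(-2j * math.pi * r_mn / wavelength)


def rician_mix(h_los: complex, h_ref: complex, kappa: float) -> complex:
    """Standard Rician weighting of LoS and reflected parts (assumed form)."""
    return (math.sqrt(kappa / (kappa + 1)) * h_los
            + math.sqrt(1 / (kappa + 1)) * h_ref)


# Hypothetical numbers: 28 GHz carrier, antenna pair 20 cm apart.
wavelength = 3e8 / 28e9
h = los_entry(0.20, wavelength)
```

The magnitude of each LoS entry decays as $1/r_{m,n}$, which is why closely spaced transmit and receive arrays produce a very strong SI component.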

A. Problem Formulation
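Let $\bar{k}$ and $\bar{j}$ denote the indices in sets $\mathcal{U}$ and $\mathcal{D}$ without the elements $k$ and $j$, respectively. The received (signal plus) interference and noise covariance matrices from UL user $k \in \mathcal{U}$ at the BS and by the DL user $j \in \mathcal{D}$ are denoted as ($R_k$) $R_{\bar{k}}$ and ($R_j$) $R_{\bar{j}}$, respectively. Let $T_k$, $\forall k \in \mathcal{U}$, and $Q_j$, $\forall j \in \mathcal{D}$, defined as
$$T_k = U_k U_k^H, \qquad Q_j = G_{RF} V_j V_j^H G_{RF}^H, \tag{10}$$
denote the transmit covariance matrices from UL user $k \in \mathcal{U}$ and for DL user $j \in \mathcal{D}$, respectively. By considering the distortions from non-ideal hardware with the extended LDR noise model, cross-interference, interference and SI, the received covariance matrices at the BS after the analog combiner, i.e., $R_k$ and $R_{\bar{k}}$, and at the DL user $j \in \mathcal{D}$, i.e., $R_j$ and $R_{\bar{j}}$, can be written as (11), shown at the top of the next page. In (11), $S_k$ and $S_j$ denote the useful received signal covariance matrices from the $k$-th UL user at the FD BS and by the $j$-th DL user, respectively. The undistorted received covariance matrices can be recovered from (11). The WSR maximization problem with respect to the digital beamformers, analog beamformer and combiner with amplitudes in $\mathcal{A}$ and phase part in $\mathcal{P}$, under the joint sum-power and per-antenna power constraints, can be stated as
$$\max_{U, V, G_{RF}, F_{RF}} \; \sum_{k \in \mathcal{U}} w_k \ln\det\big(R_{\bar{k}}^{-1} R_k\big) + \sum_{j \in \mathcal{D}} w_j \ln\det\big(R_{\bar{j}}^{-1} R_j\big) \tag{12a}$$
$$\text{s.t.} \quad \mathrm{diag}\big(U_k U_k^H\big) \preceq \Lambda_k, \quad \forall k \in \mathcal{U}, \tag{12b}$$
$$\mathrm{diag}\Big(\sum_{j \in \mathcal{D}} G_{RF} V_j V_j^H G_{RF}^H\Big) \preceq \Lambda_0, \tag{12c}$$
$$\mathrm{Tr}\big(U_k U_k^H\big) \leq \alpha_k, \quad \forall k \in \mathcal{U}, \tag{12d}$$
$$\mathrm{Tr}\Big(\sum_{j \in \mathcal{D}} G_{RF} V_j V_j^H G_{RF}^H\Big) \leq \alpha_0, \tag{12e}$$
$$\angle G_{RF}(m, n) \in \mathcal{P} \text{ and } |G_{RF}(m, n)| \in \mathcal{A}, \quad \forall m, n, \tag{12f}$$
$$\angle F_{RF}(i, j) \in \mathcal{P} \text{ and } |F_{RF}(i, j)| \in \mathcal{A}, \quad \forall i, j. \tag{12g}$$
The scalars $w_k$ and $w_j$ denote the rate weights for the UL user $k$ and DL user $j$, respectively. The diagonal matrices $\Lambda_k$ and $\Lambda_0$ denote the per-antenna power constraints for the $k$-th UL user and the FD BS, respectively, and the scalars $\alpha_k$ and $\alpha_0$ denote their sum-power constraints. The collections of digital UL and DL beamformers are denoted as $U$ and $V$, respectively. For unit-modulus HYBF, the constraints in (12f)-(12g) on the amplitude part become unit-modulus.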
Remark 1: Note that the rate achieved with (12) is not affected by the digital receivers if they are chosen as the MMSE combiners; see, e.g., (4)-(9) in [56] for more details. For WSR maximization, only the analog combiner has to be considered in the optimization problem, as it affects the size of the received covariance matrices from the UL users, i.e., the UL rate.
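To make the joint constraints concrete, the following sketch checks a UL precoder against a per-antenna limit (the diagonal of $\Lambda_k$) and a sum-power limit $\alpha_k$. It is only an illustration under these assumptions: the helper names, the tolerance and the example precoder are hypothetical.

```python
from typing import List, Sequence


def per_antenna_powers(U: Sequence[Sequence[complex]]) -> List[float]:
    """Diagonal of U U^H: power radiated by each antenna (one row of U each)."""
    return [sum(abs(x) ** 2 for x in row) for row in U]


def satisfies_joint_constraints(U, per_antenna_limits, sum_power_limit, tol=1e-12):
    """Check diag(U U^H) <= Lambda entrywise and Tr(U U^H) <= alpha."""
    powers = per_antenna_powers(U)
    per_antenna_ok = all(p <= lim + tol
                         for p, lim in zip(powers, per_antenna_limits))
    return per_antenna_ok and sum(powers) <= sum_power_limit + tol


# Hypothetical 2-antenna, 1-stream precoder with unit total power.
U_k = [[0.6 + 0j], [0.8j]]
```

The example precoder meets the sum-power limit of 1 but puts 0.64 of the power on the second antenna, so it violates a uniform per-antenna limit of 0.5. This is the situation the per-antenna constraints are meant to exclude.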

III. MINORIZATION-MAXIMIZATION
Problem (12) is non-concave in the transmit covariance matrices $T_k$ and $Q_j$ due to the interference terms, and finding its globally optimal solution is very challenging. In this section, we present the minorization-maximization optimization method [45] for solving (12) to a local optimum.
The WSR maximization problem (12) is reformulated at each iteration as a concave problem with its minorizer, using difference-of-convex (DC) programming [57] in terms of the variable to be updated, while the other variables are kept fixed. To proceed, note that the WSR in (12) can be written with the weighted-rate (WR) of user $k \in \mathcal{U}$, user $j \in \mathcal{D}$, and the WSRs for $\bar{k}$ and $\bar{j}$ as
$$\mathrm{WSR} = \underbrace{\mathrm{WR}^{UL}_k + \mathrm{WSR}^{UL}_{\bar{k}}}_{\mathrm{WSR}^{UL}} + \underbrace{\mathrm{WR}^{DL}_j + \mathrm{WSR}^{DL}_{\bar{j}}}_{\mathrm{WSR}^{DL}},$$
where $\mathrm{WSR}^{UL}$ and $\mathrm{WSR}^{DL}$ denote the WSR in UL and DL, respectively. Considering the dependence on the transmit covariance matrices, only $\mathrm{WR}^{UL}_k$ is concave in $T_k$, whereas $\mathrm{WSR}^{UL}_{\bar{k}}$ and $\mathrm{WSR}^{DL}$ are non-concave in $T_k$, when $T_{\bar{k}}$ and $Q_j$, $\forall j \in \mathcal{D}$, are fixed. Similarly, only $\mathrm{WR}^{DL}_j$ is concave in $Q_j$, whereas $\mathrm{WSR}^{DL}_{\bar{j}}$ and $\mathrm{WSR}^{UL}$ are non-concave in $Q_j$, when $Q_{\bar{j}}$ and $T_k$, $\forall k \in \mathcal{U}$, are fixed. Since a linear function is simultaneously convex and concave, DC programming introduces the first-order Taylor series expansion of $\mathrm{WSR}^{UL}_{\bar{k}}$ and $\mathrm{WSR}^{DL}$ in $T_k$, around $\hat{T}_k$ (i.e., around all $\hat{T}_k$), and of $\mathrm{WSR}^{DL}_{\bar{j}}$ and $\mathrm{WSR}^{UL}$ in $Q_j$, around $\hat{Q}_j$ (i.e., around all $\hat{Q}_j$). Let $\hat{T}$ and $\hat{Q}$ denote the sets containing all such $\hat{T}_k$ and $\hat{Q}_j$, respectively. Let $\hat{R}_k(\hat{T}, \hat{Q})$, $\hat{R}_{\bar{k}}(\hat{T}, \hat{Q})$, $\hat{R}_j(\hat{T}, \hat{Q})$ and $\hat{R}_{\bar{j}}(\hat{T}, \hat{Q})$ denote the covariance matrices $R_k$, $R_{\bar{k}}$, $R_j$ and $R_{\bar{j}}$ as a function of $\hat{T}$ and $\hat{Q}$, respectively. The linearized tangent expressions for each communication link, obtained by computing the gradients with respect to the transmit covariance matrices $T_k$ and $Q_j$, can be written as (15a)-(15d). We remark that the tangent expressions (15a)-(15d) constitute a touching lower bound for $\mathrm{WSR}^{UL}_{\bar{k}}$, $\mathrm{WSR}^{DL}_{\bar{j}}$, $\mathrm{WSR}^{DL}$ and $\mathrm{WSR}^{UL}$, respectively. Hence, the DC programming approach is also a minorization-maximization approach, regardless of the restatement of the transmit covariance matrices $T_k$ and $Q_j$ as a function of the beamformers.
Theorem 1. The gradients $\hat{A}_k$ and $\hat{B}_k$, which linearize $\mathrm{WSR}^{UL}_{\bar{k}}$ and $\mathrm{WSR}^{DL}$, respectively, with respect to $T_k$, $\forall k \in \mathcal{U}$, and the gradients $\hat{C}_j$ and $\hat{D}_j$, which linearize $\mathrm{WSR}^{DL}_{\bar{j}}$ and $\mathrm{WSR}^{UL}$, respectively, with respect to $Q_j$, $\forall j \in \mathcal{D}$, with the first-order Taylor series expansion are given in (16).
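The touching-lower-bound property can be checked numerically in a scalar toy case. The sketch below is not the matrix expression of the paper: it assumes a single interfered link whose rate is $\ln(1 + s/(n + g t))$, where $t$ is the interferer's transmit power, and all numerical values and function names are hypothetical. This rate is convex in $t$, so its first-order Taylor expansion around $\hat{t}$ lies below it everywhere and touches it at $\hat{t}$.

```python
import math


def other_link_rate(t: float, s: float = 4.0, n: float = 1.0, g: float = 0.5) -> float:
    """Rate of an interfered link when the interferer transmits with power t."""
    return math.log(1.0 + s / (n + g * t))


def rate_gradient(t: float, s: float = 4.0, n: float = 1.0, g: float = 0.5) -> float:
    """d(rate)/dt; negative, since more interference lowers the rate."""
    return g / (s + n + g * t) - g / (n + g * t)


def tangent(t: float, t_hat: float) -> float:
    """First-order Taylor expansion of the rate around t_hat (the minorizer)."""
    return other_link_rate(t_hat) + rate_gradient(t_hat) * (t - t_hat)


t_hat = 1.0
# Gap between the true rate and its tangent on a grid of powers in [0, 5].
gaps = [other_link_rate(0.1 * i) - tangent(0.1 * i, t_hat) for i in range(51)]
```

All entries of `gaps` are non-negative and the gap vanishes at `t_hat`, which is the scalar counterpart of the minorization step: the negative gradient acts as a price on the interference a user generates towards the other links.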

A. Concave Reformulation
In this section, we simplify the non-concave WSR maximization problem (12). By using the gradients (16), (12) can be reformulated as (17), given at the top of the next page.
Lemma 1. The WSR maximization problem (12) for a single-cell mmWave FD system with multi-antenna users, reformulated at each iteration with its first-order Taylor series expansion as in (17), is a concave reformulation for each link.
Proof. The optimization problem (12) restated as in (17) for each link is made of a concave part, i.e., $\log(\cdot)$, and a linear part, i.e., $\mathrm{Tr}(\cdot)$. Since a linear function is simultaneously convex and concave, (17) is concave for each link.
Let $\Psi_0 = \mathrm{diag}([\psi_1, \ldots, \psi_{M_0}])$ and $\Psi_k = \mathrm{diag}([\psi_{k,1}, \ldots, \psi_{k,M_k}])$ denote diagonal matrices containing the Lagrange multipliers associated with the per-antenna power constraints for the FD BS and UL user $k$, respectively. Let $l_0$ and $l_1, \ldots, l_K$ denote the Lagrange multipliers associated with the sum-power constraint for the FD BS and the $K$ UL users, respectively. Let $\Psi$ denote the collection of Lagrange multipliers associated with the per-antenna power constraints, i.e., $\Psi_0$ and $\Psi_k$, $\forall k \in \mathcal{U}$. Let $L$ denote the collection of Lagrange multipliers associated with the sum-power constraints. Augmenting the linearized WSR maximization problem (17) with the sum-power and practical per-antenna power constraints yields the Lagrangian (18), given at the top of this page. In (18), an unconstrained analog beamformer and combiner are assumed, and their constraints will be incorporated later.

IV. HYBRID BEAMFORMING AND COMBINING
This section presents a novel HYBF design for a multi-user mmWave mMIMO FD system based on alternating optimization. In the following, the optimization of the digital beamformers, analog beamformer and analog combiner is presented in separate subsections. While updating one variable, we assume the other variables to be fixed during the alternating optimization process. Information on the other variables updated during previous iterations is captured in the gradients.

A. Digital Beamforming
To optimize the digital beamformers, we take the derivative of (18) with respect to the conjugate of $U_k$ and $V_j$, which leads to the following KKT conditions
Proof. Please see Appendix B.
The generalized dominant eigenvector solution provides the optimized beamforming directions but not the power [57]. To include the optimal stream power allocation, we normalize the columns of the digital beamformers to unit norm. This operation preserves the optimized beamforming directions and allows us to design the optimal power allocation scheme.
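The generic operation behind this step, selecting the $d$ dominant generalized eigenvectors of a matrix pair and normalizing their columns, can be sketched with NumPy as follows. The pair $(A, B)$ below is random and purely hypothetical; in the design, the first matrix would carry the useful-signal term and the second the gradients and Lagrange multipliers, whose exact expressions are not reproduced here.

```python
import numpy as np


def dominant_generalized_eigvecs(A: np.ndarray, B: np.ndarray, d: int) -> np.ndarray:
    """Return the d dominant generalized eigenvectors of the pair (A, B).

    Solves A v = lam * B v through the eigen-decomposition of B^{-1} A and
    normalizes each selected column to unit norm.
    """
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(-eigvals.real)[:d]  # indices of the d largest eigenvalues
    V = eigvecs[:, order]
    return V / np.linalg.norm(V, axis=0, keepdims=True)


# Hypothetical Hermitian pair: A positive semidefinite, B positive definite.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = X @ X.conj().T
B = Y @ Y.conj().T + np.eye(4)
V = dominant_generalized_eigvecs(A, B, d=2)
```

Because the columns are returned with unit norm, they encode directions only; the stream powers are then attached separately by the power allocation step.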

B. Analog Beamforming
This section presents a novel approach to design the analog beamformer for the hybrid FD BS in a multi-user scenario to maximize the WSR. The structure of the fully connected analog beamformer $G_{RF}$ is shown in Figure 2. Assuming the remaining variables to be fixed, we first consider the optimization of the unconstrained analog beamformer $G_{RF}$, stated as problem (21).
Note that from (17) only the terms shown in (21) depend on the analog beamformer $G_{RF}$, and information about the other variables is captured in the gradients $\hat{C}_j$ and $\hat{D}_j$. To solve (21), we take its derivative with respect to the conjugate of $G_{RF}$, which yields the KKT condition (22). Given (22), the analog beamformer $G_{RF}$ for the mmWave FD BS can be optimized as stated in the following.
where $D_1(X)$ selects the first generalized dominant eigenvector from matrix $X$.
Proof. Please see Appendix B.
Note that Theorem 3 provides the optimized vectorized unconstrained analog beamformer $G_{RF}$, which we need to reshape with $\mathrm{unvec}(\mathrm{vec}(G_{RF}))$. To satisfy the unit-modulus and quantization constraints, we set $G_{RF}(m, n) = Q_P(\angle G_{RF}(m, n))$, $\forall m, n$. For HYBF with AMs, the columns are first scaled to be unit-norm and the quantization constraint is satisfied as $G_{RF}(m, n) = Q_A(|G_{RF}(m, n)|)\,Q_P(\angle G_{RF}(m, n))$, $\forall m, n$.

C. Analog Combining
This section presents a novel approach to design the analog combiner $F_{RF}$ for the mmWave FD BS to serve multiple UL users. Its design is more straightforward than that of the analog beamformer. Note that the trace terms appearing in (17) serve to make the beamformers' update aware of the interference generated towards other links. However, $F_{RF}$, being a combiner, does not generate any interference and therefore does not appear in the trace terms of (17). Consequently, to optimize $F_{RF}$, we can solve the optimization problem (12) instead of using its minorized version (17). By considering the unconstrained analog combiner $F_{RF}$, from (12) we obtain problem (24).
To solve (24), $F_{RF}$ has to combine the signal received at the antenna level of the hybrid FD BS, but $R_k$ and $R_{\bar{k}}$ represent the received covariance matrices after analog combining. Let ($R^{ant}_k$) $R^{ant}_{\bar{k}}$ denote the (signal-plus) interference and noise covariance matrix received at the antennas of the FD BS, which can be obtained from ($R_k$) $R_{\bar{k}}$ given in (11) by omitting $F_{RF}$.
After analog combining, we can recover $R_k$ and $R_{\bar{k}}$ as $R_k = F_{RF}^H R^{ant}_k F_{RF}$ and $R_{\bar{k}} = F_{RF}^H R^{ant}_{\bar{k}} F_{RF}$, respectively, $\forall k \in \mathcal{U}$. Problem (24) can be restated as a function of $R^{ant}_k$ and $R^{ant}_{\bar{k}}$ as problem (25).
In (17), the trace term was only linear, which made the restated optimization problem concave for each link. In (25), all the terms are fully concave. To optimize $F_{RF}$, we take the derivative with respect to the conjugate of $F_{RF}$, which yields the KKT condition (26). It is immediate from (26) that the unconstrained analog combiner can be optimized as the generalized dominant eigenvector solution of the pair of sums of the received covariance matrices at the antenna level from all the $K$ UL users. To satisfy the unit-modulus and quantization constraints for $F_{RF}$, we set $F_{RF}(m, n) = Q_P(\angle F_{RF}(m, n)) \in \mathcal{P}$, $\forall m, n$. If AMs are available, the columns are scaled to be unit-norm and the quantization constraint is satisfied as $F_{RF}(m, n) = Q_A(|F_{RF}(m, n)|)\,Q_P(\angle F_{RF}(m, n))$, $\forall m, n$.

D. Optimal Power Allocation
Given the normalized digital beamformers and analog beamformer, optimal power allocation can be included while searching for the Lagrange multipliers satisfying the joint sum-power and practical per-antenna power constraints.
Let $\Sigma^{(1)}_k$ and $\Sigma^{(2)}_k$, $\forall k \in \mathcal{U}$, and $\Sigma^{(1)}_j$ and $\Sigma^{(2)}_j$, $\forall j \in \mathcal{D}$, be defined as in (28). Given (28), the optimal stream power allocation can be included based on the result stated in the following.

Lemma 2. Optimal power allocation for the hybrid FD BS and multi-antenna UL users can be obtained by multiplying $\Sigma^{(1)}_j$ and $\Sigma^{(2)}_j$ with the diagonal power matrix $P_j$, $\forall j \in \mathcal{D}$, and $\Sigma^{(1)}_k$ and $\Sigma^{(2)}_k$ with the diagonal power matrix $P_k$, $\forall k \in \mathcal{U}$, respectively.
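Proof. The beamformers $U_k$ and $V_j$ are computed as the generalized dominant eigenvectors, which make the matrices $\Sigma^{(1)}_k$, $\Sigma^{(2)}_k$, $\forall k$, and $\Sigma^{(1)}_j$, $\Sigma^{(2)}_j$, $\forall j$, diagonal at each iteration. Multiplying any generalized dominant eigenvector solution matrix with a diagonal matrix still yields a generalized dominant eigenvector solution. Therefore, multiplying $\Sigma^{(1)}_k$ and $\Sigma^{(2)}_k$ with $P_k$, $\forall k \in \mathcal{U}$, and $\Sigma^{(1)}_j$ and $\Sigma^{(2)}_j$ with $P_j$, $\forall j \in \mathcal{D}$, still preserves the validity of the optimized beamforming directions.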
Given the optimized beamformers and fixed Lagrange multipliers, by using the result stated in Lemma 2, the stream power allocation optimization problems for the UL and DL users can be formally stated as (29a) and (29b).
Solving (29) yields the optimal power allocation scheme, in which $(X)^+ = \max\{0, X\}$. We remark that the proposed power allocation scheme is interference, SI, cross-interference and LDR noise aware, as it takes into account their effect in the gradients, which are updated at each iteration. With the beamformers fixed, we can search for the multipliers satisfying the joint constraints while doing water-filling for the powers. To do so, consider the dependence of the Lagrangian (18) on the multipliers and powers, $\mathcal{L}(\Psi, L, P)$. The dual function $\max_P \mathcal{L}(\Psi, L, P)$ is the pointwise supremum of a family of functions of $\Psi$ and $L$; it is therefore convex [58], and the globally optimal values for $\Psi$ and $L$ can be obtained by using any of the numerous convex optimization techniques. In this work, we adopt the bisection algorithm to search for the multipliers. Let $\mathcal{M}_0 = \{\lambda_0, \psi_1, \ldots, \psi_{M_0}\}$ and $\mathcal{M}_k = \{\lambda_k, \psi_{k,1}, \ldots, \psi_{k,M_k}\}$ denote the sets containing the Lagrange multipliers associated with the sum-power and practical per-antenna power constraints for the FD BS and UL user $k \in \mathcal{U}$, respectively. Let $\underline{\mu_i}$ and $\overline{\mu_i}$ denote the lower and upper bound for the search range of multiplier $\mu_i$, where $\mu_i \in \mathcal{M}_0$ or $\mu_i \in \mathcal{M}_k$. While searching for the multipliers and performing water-filling for the powers, the UL and DL power matrices become non-diagonal. Therefore, we consider the SVD of the power matrices to shape them back as diagonal. Namely, let $P_i$ denote the power matrix for user $i$, where $i \in \mathcal{U}$ or $i \in \mathcal{D}$. When $P_i$ becomes non-diagonal, we consider its SVD as $P_i = U_{P_i} D_{P_i} V_{P_i}^H$, where $U_{P_i}$, $D_{P_i}$ and $V_{P_i}$ are the left unitary, diagonal and right unitary matrices, respectively, obtained with the SVD, and we set $P_i = D_{P_i}$ to obtain diagonal power matrices.
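The interplay between water-filling and the bisection search can be illustrated with a simplified sum-power-only sketch. It is not the paper's scheme: it assumes that, once the beamformers are fixed, each stream $i$ reduces to a scalar gain $g_i$ and a strictly positive interference price $a_i$, so that the power takes the form $p_i = (w/(a_i + l) - 1/g_i)^+$ for a sum-power multiplier $l$. The per-antenna multipliers are omitted, and all names and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def stream_powers(l: float, w: float, gains: Sequence[float],
                  prices: Sequence[float]) -> List[float]:
    """p_i = (w / (price_i + l) - 1 / gain_i)^+ for a given multiplier l."""
    return [max(0.0, w / (a + l) - 1.0 / g) for g, a in zip(gains, prices)]


def waterfill_bisection(w, gains, prices, budget, tol=1e-10) -> Tuple[float, List[float]]:
    """Find the smallest l >= 0 for which the stream powers meet the budget."""
    if sum(stream_powers(0.0, w, gains, prices)) <= budget:
        return 0.0, stream_powers(0.0, w, gains, prices)  # constraint inactive
    lo, hi = 0.0, w * max(gains)  # at l = hi every stream power is zero
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(stream_powers(mid, w, gains, prices)) > budget:
            lo = mid  # too much power: raise the multiplier
        else:
            hi = mid
    return hi, stream_powers(hi, w, gains, prices)


# Hypothetical stream gains and interference prices for one user.
l_star, p = waterfill_bisection(w=1.0, gains=[4.0, 1.0, 0.2],
                                prices=[0.1, 0.1, 0.1], budget=1.0)
```

In this example the weakest stream receives no power, and the remaining two share the budget. Larger prices, i.e., more interference generated towards other links, lower the power a stream receives, which is the sense in which the allocation is interference aware.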
For unit-modulus HYBF, the complete alternating optimization based procedure to maximize the WSR based on minorization-maximization is formally stated in Algorithm 1. For HYBF with AMs, the steps $\angle G_{RF}$ and $\angle F_{RF}$ must be omitted and the amplitudes of the analog beamformer and combiner must be quantized with $Q_A(\cdot)$. Once the proposed algorithm converges, all the combiners can be chosen as the MMSE combiners, which will not affect the WSR achieved with Algorithm 1; see (4)-(9) in [56].

E. Convergence
In our context, the ingredients required to prove the convergence are minorization [45], alternating or cyclic optimization [45], the Lagrange dual function [58], the saddle-point interpretation [58] and the KKT conditions [58]. For the WSR cost function (12), we construct its minorizer as in (15a), (15b), (15c), (15d), which restates the WSR maximization as a concave problem (17) for each link. The minorizer is a touching lower bound for the original WSR problem (12), so the minorized objective never exceeds the original WSR and coincides with it at the linearization point. The minorizer, which is concave in $T_k$ and $Q_j$, still has the same gradient as the original WSR, and hence the KKT conditions are not affected. Reparameterizing $T_k$ or $Q_j$ in terms of $U_k$, $\forall k \in \mathcal{U}$, and $G_{RF}$ or $V_j$, $\forall j \in \mathcal{D}$, respectively, as in (10) with the optimal power matrices and adding the power constraints to the minorizer, we get the Lagrangian (18). Every alternating update of $\mathcal{L}$ for $V_j$, $G_{RF}$, $U_k$, $\forall j \in \mathcal{D}$, $\forall k \in \mathcal{U}$, or for $P$, $\Lambda$, $\Psi$ leads to an increase of the WSR, ensuring convergence. For the KKT conditions, at the convergence point, the gradients of $\mathcal{L}$ for $V_j$, $G_{RF}$, $U_k$ or $P$ correspond to the gradients of the Lagrangian of (12), i.e., of the original WSR problem. For fixed analog and digital beamformers, $\mathcal{L}$ is concave in $P$; hence we have strong duality for the saddle point, i.e., $\max_{P} \min_{L, \Psi} \mathcal{L}(L, \Psi, P)$.

Repeat
Quantize $\angle G_{RF}$ and $\angle F_{RF}$ ($|G_{RF}|$ and $|F_{RF}|$ with AMs)
Let $X^*$ and $x^*$ denote the optimal solution for matrix $X$ or scalar $x$ at the convergence, respectively. When Algorithm 1 converges, the solution of the following optimization problem satisfies the KKT conditions for the powers in $P$ and the complementary slackness conditions, where all the individual factors in the products are non-negative; for the per-antenna power constraints $\Psi_0^*$ and $\Psi_k^*$, the sum of non-negative terms being zero implies that all terms are zero.
Remark 3: The unit-modulus HYBF scheme converges to a local optimum where $\angle G_{RF}(m, n), \angle F_{RF}(m, n) \in \mathcal{P}$ with $|G_{RF}(m, n)|, |F_{RF}(m, n)| = 1$, $\forall m, n$. Unconstrained HYBF with AMs converges to a different local optimum, where $\angle G_{RF}(m, n), \angle F_{RF}(m, n) \in \mathcal{P}$ and $|G_{RF}(m, n)|, |F_{RF}(m, n)| \in \mathcal{A}$, $\forall m, n$. Due to quantization, $G_{RF}$ and $F_{RF}$ obtained with Algorithm 1 tend to lose their optimality and consequently achieve less WSR compared to their infinite resolution case. For unit-modulus HYBF, the loss in WSR depends only on the resolution of the phases. For HYBF with AMs, the loss in WSR depends on the resolution of both amplitudes and phases.

F. Complexity Analysis
In this section, we analyze the per-iteration computational complexity of Algorithm 1, assuming that the antenna dimensions become large. One iteration consists of updating $K$ and $J$ digital beamformers for the UL and DL users, respectively, and one analog beamformer and one analog combiner for the FD BS. One dominant generalized eigenvector computation to update the analog beamformer $G_{RF}$ from a matrix of size $M_0 M_t \times M_0 M_t$ is $O(M_0^2 M_t^2)$. To update the gradients $\hat{A}_k$ and $\hat{B}_k$ for one UL user, the complexity is $O((K-1) N_r^3)$ and $O(J N_j^3)$, respectively. For the gradients $\hat{C}_j$ and $\hat{D}_j$, required to update the beamformer of the $j$-th DL user, the computational complexity is $O((J-1) N_j^3)$ and $O(K N_r^3)$, respectively. Updating the beamformers of the $k$-th UL and $j$-th DL users as the generalized dominant eigenvectors adds a complexity of $O(u_k M_k^2)$ and $O(v_j N_j^2)$, respectively. The update of the Lagrange multipliers associated with the per-antenna power constraints for the FD BS and the UL users is linear in the number of antennas $M_0$ or $M_k$, respectively. However, as the multipliers' search is performed jointly with the power allocation, this cost can be ignored. Updating the analog combiner $F_{RF}$ for the FD BS is $O(N_r N_0^2)$. Under the assumption that the antenna dimensions become large, the per-iteration complexity is $\approx O(K^2 N_r^3 + KJ N_j^3 + J^2 N_j^3 + JK N_r^3 + M_0^2 M_t^2 + N_r N_0^2)$, which depends on the number of UL and DL users served by the mmWave FD BS.
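To get a feel for which term dominates, the expression above can be evaluated numerically. The following is a minimal sketch: the function name and dictionary keys are hypothetical, constants hidden by the $O(\cdot)$ notation are dropped, and setting $M_t = N_r$ equal to the number of RF chains is an assumption made here for illustration only.

```python
def per_iteration_complexity(K, J, N_r, N_j, M_0, M_t, N_0):
    """Evaluate each term of the per-iteration complexity (O(.) constants dropped)."""
    return {
        "UL gradients A_k": K * K * N_r**3,
        "UL gradients B_k": K * J * N_j**3,
        "DL gradients C_j": J * J * N_j**3,
        "DL gradients D_j": J * K * N_r**3,
        "analog beamformer G_RF": M_0**2 * M_t**2,
        "analog combiner F_RF": N_r * N_0**2,
    }


# Simulation setup of Section V: two UL and two DL users, N_j = 5,
# M_0 = 100 transmit and N_0 = 50 receive antennas at the FD BS.
# Assumption (not stated in the text): M_t = N_r = number of RF chains.
terms_8 = per_iteration_complexity(K=2, J=2, N_r=8, N_j=5, M_0=100, M_t=8, N_0=50)
total_8 = sum(terms_8.values())
dominant_8 = max(terms_8, key=terms_8.get)

for n_rf in (8, 10, 16, 32):
    terms = per_iteration_complexity(2, 2, n_rf, 5, 100, n_rf, 50)
    print(n_rf, sum(terms.values()), max(terms, key=terms.get))
```

Under these assumptions the analog beamformer update is the largest term for every listed number of RF chains, because it grows with the square of the number of BS transmit antennas.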

V. SIMULATION RESULTS
This section presents simulation results to evaluate the performance of the proposed HYBF scheme. For comparison, we define the following benchmark schemes:
a) A fully digital HD scheme with LDR noise, which serves the UL and DL users with time division duplexing. Being HD, it is affected neither by the SI nor by the cross-interference.
b) A fully digital FD scheme with LDR noise. This scheme sets an upper bound on the maximum gain achievable by a hybrid FD system.
Hereafter, the HYBF designs with the unit-modulus constraint and with AMs are denoted as HYBF-UM and HYBF-AMs, respectively. We define the signal-to-noise ratio (SNR) for the mmWave mMIMO FD system as

$$\mathrm{SNR} = \alpha_0 / \sigma_0^2,$$

where the scalars $\alpha_0$ and $\sigma_0^2$ denote the total transmit power and the thermal noise variance for the FD BS, respectively. We set the thermal noise level for the DL users as $\sigma_j^2 = \sigma_0^2$, $\forall j$, and the transmit power for the UL users as $\alpha_k = \alpha_0$, $\forall k$. The total transmit power is normalized to 1 and the noise variance is chosen based on the desired SNR. To compare the gain of an FD system over an HD system, we define the additional gain in percentage as

$$\mathrm{Gain} = \frac{\mathrm{WSR}_{FD} - \mathrm{WSR}_{HD}}{\mathrm{WSR}_{HD}} \times 100 \; [\%],$$

where $\mathrm{WSR}_{FD}$ and $\mathrm{WSR}_{HD}$ denote the WSR of an FD and an HD system, respectively.

To evaluate the performance, we set the per-antenna power constraints for the FD BS and the UL users as the total transmit power divided by the number of antennas, i.e. $(\alpha_0 / M_0) I$ and $(\alpha_k / M_k) I$, $\forall k$. The BS and the users are assumed to be equipped with a uniform linear array (ULA) with antennas separated by half a wavelength. The transmit and receive antenna arrays at the BS are assumed to be placed $D = 20$ cm apart, with the relative angle $\Theta = 90^\circ$, and $r_{m,n}$ is modelled as in (9) [23]. The Rician factor $\kappa$ for the SI channel is set to 1. We assume that the FD BS has $M_0 = 100$ transmit and $N_0 = 50$ receive antennas. It serves two UL and two DL users with $M_k = N_j = 5$ antennas and with 2 data streams for each user. The phases for both designs are quantized in the interval $[0, 2\pi]$ with an 8-bit uniform quantizer $Q_P(\cdot)$. For HYBF with AMs, the amplitudes are quantized with a 3-bit uniform quantizer $Q_A(\cdot)$ in the interval $[0, a_{\max}]$, where $a_{\max} = \max\{\max |G_{RF}|, \max |F_{RF}|\}$ is the largest modulus among the elements of $G_{RF}$ and $F_{RF}$. We assume the same LDR noise level for the users and the FD BS, i.e. $\kappa_0 = \beta_0 = \kappa_k = \beta_j$. The rate weights for the UL   Table II.
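The phase and amplitude quantization described above can be sketched as follows. This is an illustrative implementation, not the authors' code: the function names are hypothetical, and placing the $2^b$ levels uniformly with both interval endpoints included is an assumption, since the text only states that the quantizers are uniform.

```python
import numpy as np


def quantize_uniform(x, n_bits, lo, hi):
    """Map each entry of x to the nearest of 2**n_bits uniform levels in [lo, hi]."""
    levels = 2 ** n_bits
    step = (hi - lo) / (levels - 1)
    idx = np.clip(np.round((x - lo) / step), 0, levels - 1)
    return lo + idx * step


def quantize_analog(G, phase_bits=8, amp_bits=None, a_max=None):
    """Quantize an analog beamformer or combiner.

    amp_bits=None gives the unit-modulus case (HYBF-UM); otherwise the
    amplitudes are quantized in [0, a_max] (HYBF-AMs).
    """
    phase = np.mod(np.angle(G), 2 * np.pi)              # phases in [0, 2*pi)
    q_phase = quantize_uniform(phase, phase_bits, 0.0, 2 * np.pi)
    if amp_bits is None:
        q_amp = np.ones_like(phase)
    else:
        if a_max is None:
            a_max = np.max(np.abs(G))                   # largest modulus in G
        q_amp = quantize_uniform(np.abs(G), amp_bits, 0.0, a_max)
    return q_amp * np.exp(1j * q_phase)


rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
a_max = np.max(np.abs(G))
G_um = quantize_analog(G)                               # 8-bit phases, unit modulus
G_am = quantize_analog(G, amp_bits=3, a_max=a_max)      # 8-bit phases, 3-bit amplitudes
```

In the paper, $a_{\max}$ is taken over both $G_{RF}$ and $F_{RF}$, so in practice the same `a_max` value would be passed when quantizing either matrix.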
The digital beamformers are initialized as the dominant eigenvectors of the channel covariance matrices of the intended users. The analog beamformer and combiner are initialized as the dominant eigenvectors of the sum of the channel covariance matrices across all the UL and DL users, respectively. Note that as we assume perfect CSI, the SI can be cancelled with HYBF only up to the LDR noise level, which represents the residual SI.

Figure 4 shows the average WSR achieved with the proposed HYBF designs as a function of the LDR noise with SNR = 0 dB. The fully digital FD scheme achieves an additional gain of ∼ 97% over the fully digital HD scheme. The impact of different LDR noise levels on the maximum achievable WSR for a mmWave FD system with different numbers of RF chains is also shown. For $\kappa_0 \le -40$ dB, HYBF-UM and HYBF-AMs achieve an additional gain of ∼ 85%, 64%, 42%, 3% and ∼ 89%, 74%, 60%, 28% with 32, 16, 10, 8 RF chains, respectively. As the LDR noise variance increases, the achievable WSR of both the hybrid FD and the fully digital HD systems degrades severely.

Figure 5 shows the achieved average WSR as a function of the LDR noise with SNR = 40 dB. For $\kappa_0 \le -80$ dB, HYBF-UM and HYBF-AMs achieve an additional gain of ∼ 65%, 55%, 41%, 15% and ∼ 67%, 62%, 55%, 26% with 32, 16, 10, 8 RF chains, respectively, and increasing the LDR noise variance degrades the achieved average WSR. Comparing Figure 4 with Figure 5, we can see that at low SNR, HYBF-UM with only 8 RF chains performs close to the fully digital HD scheme. As the SNR increases to 40 dB, HYBF-UM with 8 RF chains achieves an additional gain of ∼ 15%. HYBF-AMs with only 8 RF chains outperforms the fully digital HD scheme at all the SNR levels. The results also show that HYBF-AMs with 10 RF chains achieves an average WSR similar to that of HYBF-UM with 16 RF chains.
It is interesting to observe that increasing the SNR from 0 dB to 40 dB decreases the thermal noise variance, so that the LDR noise variance dominates the noise floor already at $\kappa_0 = -80$ dB for SNR = 40 dB. For SNR = 0 dB, the LDR noise variance dominates only for $\kappa_0 > -40$ dB. From this observation, we can conclude that hardware with a low LDR noise level is required to benefit from a high SNR in mmWave FD systems.

Figure 6 shows the average WSR with a low LDR noise level $\kappa_0 = -80$ dB with 32, 16, 10 and 8 RF chains as a function of the SNR. Both the proposed designs perform very close to the fully digital FD scheme with 32 RF chains. HYBF-UM and HYBF-AMs outperform the fully digital HD scheme with only 8 RF chains at high SNR and at any SNR level, respectively. The advantage of AMs is evident: they provide additional gain at all the SNR levels when the number of RF chains at the FD BS is small. With a high number of RF chains, digital beamforming has enough freedom in amplitude manipulation to manage the interference, and adding AMs does not bring further improvement.

Figure 7 shows the average WSR achieved with a moderate LDR noise level $\kappa_0 = -60$ dB. At low SNR, the achieved average WSR is similar to that reported in Figure 6. At high SNR, the LDR noise variance starts dominating, which leads to a lower average WSR than in Figure 6. Figure 8 shows the achieved WSR as a function of the SNR with a very large LDR noise variance of $\kappa_0 = -40$ dB. Comparing the results reported in Figure 8 with those in Figures 6-7, we can see that the LDR noise variance dominates for most of the considered SNR range. At very low SNR, the achieved WSR is similar to that reported in Figures 6-7. However, an increase in the SNR does not translate into a higher WSR. The maximum achievable WSR with $\kappa_0 = -40$ dB saturates already at SNR = 20 dB for both the HD and FD systems, and a further improvement in the SNR does not yield a higher WSR.
When the LDR noise variance dominates, it acts as a ceiling on the effective received-signal-to-LDR-plus-thermal-noise ratio (RSLTR). The transmit and receive LDR noise variances are proportional to the total transmit power per antenna and to the received power per RF chain after the analog combining, respectively. When the LDR noise variance is large, the thermal noise variance has a negligible effect on the effective RSLTR. Consequently, a decrease in the thermal noise variance (i.e., an increase in the SNR) does not yield a better WSR.

Figure 9 shows the achievable performance of HYBF-UM and HYBF-AMs as a function of the number of RF chains with SNR = 20 dB, in comparison with the benchmark schemes, with very high and very low LDR noise levels. In particular, with a very high LDR noise level $\kappa_k = -40$ dB and 8 RF chains, HYBF-UM and HYBF-AMs perform close to the fully digital HD system, and an increase in the number of RF chains improves the performance, which tends towards the WSR achieved by a fully digital FD system with LDR noise level $\kappa_k = -40$ dB. Similar behaviour can be observed for the case of a low LDR noise level $\kappa_k = -80$ dB. Both the proposed schemes achieve a higher WSR with the same number of RF chains in the latter case. We can also see that AMs provide additional gain with a low number of RF chains, and as the number of RF chains increases, the gap between the WSR achievable with HYBF-AMs and with HYBF-UM closes. In particular, with 32 RF chains, the difference in the WSR with or without AMs becomes negligible.

From the results reported in Figures 4-9, we can conclude that the proposed HYBF schemes achieve a significant performance improvement, in terms of average WSR, compared to a fully digital HD system. The LDR noise plays a key role in determining the maximum achievable WSR for both the FD and HD systems. Figures 4-5 show how an increase in the LDR noise variance degrades the average WSR at low and high SNR levels.
Figures 6-7 show that with a large to moderate dynamic range, the LDR noise degrades the performance only at very high SNR. Figure 8 shows the achieved WSR as a function of the SNR with a very large LDR noise variance. In that case, the WSR saturates at SNR = 20 dB and a further improvement in the SNR does not yield a higher WSR. Figure 9 makes clear how the number of RF chains at the mmWave FD BS affects the achievable WSR with different LDR noise levels, with and without AMs.
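The ceiling effect discussed in this section can be illustrated with a scalar toy model. The sketch below is not the system model of the paper: it assumes a single link with unit signal power, a distortion term equal to $\kappa$ times the signal power, and no interference or array gain, so the SNR at which the rate flattens will not match the figures. The function names are hypothetical.

```python
import math


def db_to_linear(x_db):
    """Convert a value in dB to linear scale."""
    return 10.0 ** (x_db / 10.0)


def effective_ratio(snr_db, kappa_db):
    """Toy RSLTR with unit signal power: 1 / (kappa + thermal noise variance)."""
    thermal = 1.0 / db_to_linear(snr_db)     # noise variance for unit transmit power
    distortion = db_to_linear(kappa_db)      # LDR noise proportional to signal power
    return 1.0 / (distortion + thermal)


def toy_rate(snr_db, kappa_db):
    """Rate in bits per channel use for the scalar toy link."""
    return math.log2(1.0 + effective_ratio(snr_db, kappa_db))


for kappa_db in (-80, -60, -40):
    rates = [round(toy_rate(snr_db, kappa_db), 2) for snr_db in (0, 20, 40, 60, 80)]
    print(f"kappa = {kappa_db} dB:", rates)
```

In this toy model the ratio can never exceed $1/\kappa$, so once the thermal noise falls below the LDR noise, further increases of the SNR barely change the rate, which mirrors the saturation reported for Figure 8.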

VI. CONCLUSION
This paper has presented a novel HYBF design to maximize the WSR in a single-cell mmWave FD system with multi-antenna users in the presence of LDR noise. The beamformers were designed under the joint sum-power and the practical per-antenna power constraints. Simulation results showed that multi-user mmWave FD systems can outperform the fully digital HD system with only a few RF chains. The advantage of having amplitude control at the analog processing stage was also investigated, and the benefit was evident with a small number of RF chains. The achievable average WSR with different levels of the LDR noise variance was also investigated, and the proposed HYBF designs outperformed the fully digital HD system at any LDR noise level.

APPENDIX A GRADIENT DERIVATION
The proof of Theorem 1 is based on the result derived in the following.
Lemma 3. Let $Y = AXB + a\,A\,\mathrm{diag}(X + Q)\,B + b\,\mathrm{diag}(CXD + E) + F$, where the sizes of the matrices involved are such that the products are valid. The derivative of $\ln\det Y$ with respect to $X$ is given by

$$\frac{\partial \ln\det Y}{\partial X} = A^H Y^{-H} B^H + a\,\mathrm{diag}(A^H Y^{-H} B^H) + b\,C^H \mathrm{diag}(Y^{-H})\,D^H.$$

Proof. Let $\varphi = \ln\det(Y)$. Its differential can be expressed through the Frobenius inner product, denoted by the operator $:$, i.e. $G_{RF} : H = \mathrm{Tr}(G_{RF}^H H)$. When the derivative is taken with respect to $X$, the contribution of the last term of $Y$ is zero, as it is independent of $X$. Replacing the Frobenius product with the trace operator, using its cyclic shift property and separating the terms yields three contributions, denoted I, II and III, plus a term that is independent of $X$ and is therefore also zero. To prove the result, we derive the derivatives of I, II and III separately. Firstly, for I, the result follows by using $:$ and some simple algebraic manipulations. To obtain the derivative of II, we first define $\mathrm{diag}(X) = Z$. The diagonal of $X$ can be written as $\mathrm{diag}(X) = I \circ X$, where $\circ$ denotes the Hadamard product. The result follows by writing II with $:$, expressing the diagonal term as a function of $\circ$, and using the commutative property of the Hadamard product. To compute the derivative of III, we first define $\mathrm{diag}(CXD) = W$ and use a similar approach as in (45). Combining the results for the three terms concludes the proof of Lemma 3.
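The three terms I, II and III can be worked out as follows. This is a sketch assuming $Y = AXB + a\,A\,\mathrm{diag}(X+Q)\,B + b\,\mathrm{diag}(CXD+E) + F$ with real scalars $a$, $b$ and the Frobenius convention $G : H = \mathrm{Tr}(G^H H)$; the intermediate expressions are a reconstruction and need not match the authors' equations line by line.

```latex
\begin{align}
d\varphi &= Y^{-H} : dY, \qquad
dY = A\,dX\,B + a\,A\,(I \circ dX)\,B + b\,\big(I \circ (C\,dX\,D)\big), \\
\text{I:}\quad Y^{-H} : A\,dX\,B
  &= A^H Y^{-H} B^H : dX, \\
\text{II:}\quad Y^{-H} : a\,A\,(I \circ dX)\,B
  &= a\,A^H Y^{-H} B^H : (I \circ dX)
   = a\,\mathrm{diag}(A^H Y^{-H} B^H) : dX, \\
\text{III:}\quad Y^{-H} : b\,\big(I \circ (C\,dX\,D)\big)
  &= b\,(I \circ Y^{-H}) : C\,dX\,D
   = b\,C^H \mathrm{diag}(Y^{-H})\,D^H : dX, \\
\frac{\partial \ln\det Y}{\partial X}
  &= A^H Y^{-H} B^H + a\,\mathrm{diag}(A^H Y^{-H} B^H)
   + b\,C^H \mathrm{diag}(Y^{-H})\,D^H .
\end{align}
```

Steps II and III use the fact that the Hadamard product with the identity can be moved from one argument of the Frobenius inner product to the other, i.e. $M : (I \circ N) = (I \circ M) : N$.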
To prove Theorem 1, note that the covariance matrices in (11) have a special (Hermitian) structure, i.e., $B = A^H$ and $D = C^H$. Therefore, the result of Lemma 3 for this particular case is given in the following.

Lemma 4. Let $Y = AXB + a\,A\,\mathrm{diag}(X + Q)\,B + b\,\mathrm{diag}(CXD + E) + F$, where the sizes of the matrices involved are such that the products are valid. Let $B = A^H$ and $D = C^H$. Then the derivative of $\ln\det(Y)$ with respect to $X$ is given by

$$\frac{\partial \ln\det Y}{\partial X} = A^H Y^{-H} A + a\,\mathrm{diag}(A^H Y^{-H} A) + b\,C^H \mathrm{diag}(Y^{-H})\,C.$$

Proof. The result follows directly from Lemma 3 by substituting $B = A^H$ and $D = C^H$.

Proof of Theorem 1. To derive the gradients that linearize the WSR with respect to $T_k$ and $Q_j$, we first split the WSR into the contribution of the link of interest and the contributions of the remaining links. $\mathrm{WSR}^{UL}_{\bar{k}}$ and $\mathrm{WSR}^{DL}$ should be linearized for $T_k$, and $\mathrm{WSR}^{DL}_{\bar{j}}$ and $\mathrm{WSR}^{UL}$ for $Q_j$. Note from (11) that $T_k$ appears in $\mathrm{WSR}^{UL}_{\bar{k}}$ and $\mathrm{WSR}^{DL}$ with the structure $Y = AXA^H + a\,A\,\mathrm{diag}(X + Q)\,A^H + b\,\mathrm{diag}(CXC^H + E) + F$, where the scalars $a$ and $b$ are due to the LDR noise model, $A$ and $C$ are the interfering channels, and $F$ and $E$ contain the noise contributions from the other transmit covariance matrices, which are independent of $T_k$. The same structure also holds for the DL covariance matrices $Q_j$, $\forall j \in \mathcal{D}$. Applying the result of Lemma 4 with $Y = R_k$ or $Y = R_{\bar{k}}$ repeatedly, $K - 1$ times, to linearize $\mathrm{WSR}^{UL}_{\bar{k}}$ with respect to $T_k$ yields the gradient $\hat{A}_k$. Similarly, considering $Y = R_j$ or $Y = R_{\bar{j}}$, $\forall j \in \mathcal{D}$, and applying the result of Lemma 4 yields the gradient $\hat{B}_k$.
The same reasoning also holds for $Q_j$, which leads to the gradients $\hat{C}_j$ and $\hat{D}_j$ by applying the result of Lemma 4 to $\mathrm{WSR}^{DL}_{\bar{j}}$ $J - 1$ times and to $\mathrm{WSR}^{UL}$ $K$ times, respectively, $\forall j \in \mathcal{D}$.
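The gradient structure used in this appendix can be checked numerically. The sketch below verifies the Hermitian case of Lemma 4 for real-valued matrices, where $A^H$ becomes $A^T$, by comparing the closed form $A^T Y^{-T} A + a\,\mathrm{diag}(A^T Y^{-T} A) + b\,C^T \mathrm{diag}(Y^{-T})\,C$ against central finite differences. The matrix sizes, the random test data and all function names are assumptions chosen for illustration.

```python
import numpy as np


def diag_part(M):
    """Keep only the diagonal of M (the diag(.) operator of the LDR noise model)."""
    return np.diag(np.diag(M))


def build_Y(X, A, C, Q, E, F, a, b):
    """Y = A X A^T + a A diag(X + Q) A^T + b diag(C X C^T + E) + F (real case)."""
    return (A @ X @ A.T + a * A @ diag_part(X + Q) @ A.T
            + b * diag_part(C @ X @ C.T + E) + F)


def lndet_gradient(X, A, C, Q, E, F, a, b):
    """Closed-form gradient of ln det Y with respect to X."""
    Y_inv_T = np.linalg.inv(build_Y(X, A, C, Q, E, F, a, b)).T
    G = A.T @ Y_inv_T @ A
    return G + a * diag_part(G) + b * C.T @ diag_part(Y_inv_T) @ C


def numerical_gradient(X, A, C, Q, E, F, a, b, eps=1e-6):
    """Entry-wise central finite differences of ln det Y."""
    grad = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            dX = np.zeros_like(X)
            dX[i, j] = eps
            _, up = np.linalg.slogdet(build_Y(X + dX, A, C, Q, E, F, a, b))
            _, down = np.linalg.slogdet(build_Y(X - dX, A, C, Q, E, F, a, b))
            grad[i, j] = (up - down) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
m, n = 3, 4                                   # X is m x m, Y is n x n
A = rng.standard_normal((n, m))
C = rng.standard_normal((n, m))
S = rng.standard_normal((m, m))
X = S @ S.T                                   # positive semidefinite covariance
Q, E, F = np.eye(m), np.eye(n), np.eye(n)     # keep Y positive definite
a, b = 0.1, 0.05

args = (X, A, C, Q, E, F, a, b)
max_err = np.max(np.abs(lndet_gradient(*args) - numerical_gradient(*args)))
print("max abs deviation:", max_err)
```

The finite-difference check treats all entries of $X$ as independent variables, which matches the way the differential is taken in the proof.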

APPENDIX B PROOF OF THEOREM 3
The dominant generalized eigenvector solution maximizes the reformulated concave WSR maximization problem (49). To prove Theorem 3 for solving (49), we first consider the UL digital beamforming solution, keeping the analog beamformer and the digital DL beamformers fixed. We proceed by considering user $k \in \mathcal{U}$, for which we wish to compute the WSR-maximizing digital UL beamformer. The same proof is valid $\forall k \in \mathcal{U}$. The proof relies on simplifying the maximization problem (50) until Hadamard's inequality applies, as in Proposition 1 of [57] or Theorem 1 of [59]. The Cholesky decomposition of the matrix $(\hat{A}_k + \hat{B}_k + l_k I + \Psi_k)$ is given as $L_k L_k^H$, where $L_k$ is the lower triangular Cholesky factor. By defining $\tilde{U}_k = L_k^H U_k$, (50) reduces to an equivalent maximization over $\tilde{U}_k$.
By Hadamard's inequality [60, p. 233], it can be seen that the optimal $O_k$ must be diagonal. Therefore, the solution reduces to an eigenvector problem, from which we select the $u_k$ dominant eigenvectors, which concludes the proof for the UL beamformer of user $k \in \mathcal{U}$. For the digital DL beamformers, the proof follows similarly by considering the corresponding optimization problem $\forall j \in \mathcal{D}$ and simplifying it until Hadamard's inequality applies, which yields a result similar to (53).
The proof does not apply directly to the analog beamformer $G_{RF}$, as the KKT conditions have the form $A_1 G_{RF} A_2 = B_1 G_{RF} B_2$, which cannot be solved directly for $G_{RF}$. To solve them for $G_{RF}$, we apply the identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ [61], which allows (22) to be rewritten in vectorized form. The WSR-maximizing analog beamformer can alternatively be derived as follows, which allows the proof for the digital beamformers to be applied directly. First, we apply a noise whitening procedure to the received signal, using the noise-plus-interference covariance matrix $R_{\bar{j}}$. The whitened signal can then be rewritten in terms of $\tilde{y}_j = R_{\bar{j}}^{-1/2} y_j$ and $\tilde{n}_j$, where $\tilde{n}_j$ represents the whitened noise-plus-interference signal. After the approximation to concave form and some algebraic manipulations on the linearized term, the resulting WSR optimization problem can be written as a maximization over $G_{RF}$ of a sum over $j \in \mathcal{D}$, given in (57). Taking the derivative of (57) with respect to the conjugate of $G_{RF}$ leads to the same generalized eigenvector solution as in (23). Note that this alternative representation has the same form as (50), which can be solved for the vectorized version of the analog beamformer $G_{RF}$. Therefore, the proof for the UL and DL digital beamformers can now be applied directly to the vectorized analog beamformer $\mathrm{vec}(G_{RF})$, which is summed over all the DL users served by the mmWave FD BS.
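The vectorization identity used for the analog beamformer, $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$, can be checked numerically. The sketch below uses random complex matrices of arbitrary sizes chosen for illustration; `vec` stacks the columns of its argument, and $B^T$ is the plain transpose, not the conjugate transpose.

```python
import numpy as np


def vec(M):
    """Stack the columns of M into a single vector (column-major order)."""
    return M.reshape(-1, order="F")


def random_complex(rng, shape):
    """Random complex matrix with standard normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


rng = np.random.default_rng(1)
A = random_complex(rng, (4, 3))
X = random_complex(rng, (3, 2))    # plays the role of G_RF in the KKT condition
B = random_complex(rng, (2, 5))

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
identity_error = np.max(np.abs(lhs - rhs))
print("max abs deviation:", identity_error)
```

Applied to both sides of $A_1 G_{RF} A_2 = B_1 G_{RF} B_2$, the identity gives $(A_2^T \otimes A_1)\,\mathrm{vec}(G_{RF}) = (B_2^T \otimes B_1)\,\mathrm{vec}(G_{RF})$, which has the form of a generalized eigenvector problem in $\mathrm{vec}(G_{RF})$.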