A Deep Learning-Assisted Cooperative Diversity Method under Channel Aging

Abstract: Single-relay selection is a simple but efficient scheme for cooperative diversity among multiple user devices. However, the wrong selection of the best relay due to aged channel state information (CSI) remarkably degrades its performance, overwhelming this cooperative gain. Multi-relay selection is robust against channel aging, but multiple timing offset (MTO) and multiple carrier frequency offset (MCFO) among spatially distributed relays hinder its implementation in practical systems. In this paper, therefore, we propose a deep learning-based cooperative diversity method coined predictive relay selection (PRS) that chooses a single relay with the largest predicted CSI, which can alleviate the effect of channel aging while avoiding MTO and MCFO. Performance is evaluated analytically and numerically, revealing that PRS clearly outperforms the existing schemes with a negligible complexity burden.

I. INTRODUCTION
Cooperative diversity [1] is an effective technique for achieving the same spatial diversity as multi-input multi-output (MIMO) transmission through collaboration among multiple single-antenna nodes when an antenna array cannot be embedded in a mobile terminal. The main difference between MIMO and cooperative diversity is the inherent asynchronization among spatially distributed relays in the latter. Multiple timing offset (MTO) [2] and multiple carrier frequency offset (MCFO) [3] among simultaneously transmitting relays make multi-relay selection methods, such as distributed beamforming [4] and distributed space-time coding (DSTC) [5], hard to implement in practical systems. In contrast, a single-relay selection approach called opportunistic relay selection (ORS) has been widely recognized as a simple but efficient way to achieve cooperative diversity [6]. Although only a single node is opportunistically selected to retransmit, ORS is expected to match the performance of the all-participating strategy using DSTC while avoiding the need for multi-relay synchronization.
However, the channel state information (CSI) used to select the best relay may differ from the actual CSI because of feedback delay. Retransmitting signals on a wrong relay selected on the basis of aged CSI substantially deteriorates the performance of ORS [7]-[9]. To retain cooperative diversity under channel aging, generalized selection combining (GSC) [10] and its enhanced version, N plus normalized threshold GSC (N+NT-GSC) [11], have been proposed. However, these schemes require at least N orthogonal channels for retransmission, which reduces the spectral efficiency to around 1/N. In [12], one author of this paper proposed a scheme called opportunistic space-time coding (OSTC) that alleviates the effect of aged CSI without sacrificing spectral efficiency. To the best of the authors' knowledge, OSTC achieves the best result so far under channel aging, but its gap to the optimal performance is still large, which motivates the work in this paper.
Recently, a technique referred to as channel prediction [13], [14], which improves the timeliness of CSI by forecasting future CSI in advance, has attracted the attention of researchers. In this paper, a deep recurrent neural network with Long Short-Term Memory (LSTM) [15] is employed to build a channel predictor, leveraging its capability in time-series prediction. On this basis, we propose a novel cooperative diversity method coined predictive relay selection (PRS). Its key idea is to choose a single relay (in order to avoid the MTO and MCFO of multi-relay selection) with the largest predicted CSI, gaining a prediction horizon that counteracts the induced delay. A closed-form expression for the outage probability of PRS is derived and then verified by simulations. Performance evaluation reveals that it clearly outperforms the existing schemes with a negligible complexity burden. The rest of this paper is organized as follows: Section II introduces the system model. Sections III and IV present the proposed scheme and analyze its outage probability, respectively. Numerical results are given in Section V. Finally, Section VI concludes this paper.

II. SYSTEM MODEL

A. Model of Cooperative Networks
Consider a two-hop decode-and-forward (DF) cooperative network in which a source $s$ communicates with a destination $d$ with the help of $K$ relays; the direct link is neglected because of line-of-sight blockage. The received signal on link $A \to B$ is modeled as $y_B = h_{A,B} x_A + z_B$, where $x_A \in \mathbb{C}$ is the symbol transmitted from node $A$ with average power $P_A = \mathbb{E}[|x_A|^2]$ ($\mathbb{E}$ denotes the expectation operator), $h_{A,B}$ is the channel coefficient, a zero-mean circularly symmetric complex Gaussian random variable with variance $\sigma_h^2$, i.e., $h \sim \mathcal{CN}(0, \sigma_h^2)$, under Rayleigh flat-fading channels, and $z_B$ is additive white Gaussian noise with zero mean and variance $\sigma_n^2$, i.e., $z \sim \mathcal{CN}(0, \sigma_n^2)$. The instantaneous signal-to-noise ratio (SNR) of link $A \to B$ is $\gamma_{A,B} = |h_{A,B}|^2 P_A / \sigma_n^2$, and the average SNR is $\bar{\gamma}_{A,B} = \mathbb{E}[\gamma_{A,B}] = \sigma_h^2 P_A / \sigma_n^2$. Node $A$ can be the source ($A = s$) or the $k$-th relay ($A = k$, $k \in \{1, \ldots, K\}$), corresponding to $B = k$ or $B = d$, respectively.
Because of severe signal attenuation, single-antenna relays should operate in half-duplex mode to avoid harmful self-interference between the transmitter and receiver circuits. Without loss of generality, time-division multiplexing is assumed hereinafter, so the signal transmission is organized into two phases. In the first phase, as shown in Fig. 1, the source (e.g., the drone in the figure) transmits a signal, and the relays that correctly decode this signal form a decoding subset (denoted by $\mathcal{DS}$) of the source-relay links:

$$\mathcal{DS} \triangleq \left\{ k \,\middle|\, \tfrac{1}{2}\log_2\left(1+\gamma_{s,k}\right) \geq R \right\}, \tag{1}$$

where $R$ is the end-to-end target rate of the two-hop cooperative network. Note that the required rate for either link rises to $2R$ because of the half-duplex mode. The best relay $\dot{k}$ in conventional ORS is opportunistically selected from $\mathcal{DS}$ according to $\dot{k} = \arg\max_{k \in \mathcal{DS}} \hat{\gamma}_{k,d}$, where $\hat{\gamma}_{k,d}$ is the SNR of the relay-destination link at the instant of relay selection, i.e., an outdated version of $\gamma_{k,d}$ at the time of actual signal transmission. In comparison, the proposed PRS scheme replaces the aged CSI with the predicted CSI $\check{h}$ and determines $\dot{k}$ according to $\dot{k} = \arg\max_{k \in \mathcal{DS}} \check{\gamma}_{k,d}$, where $\check{\gamma}_{k,d} = |\check{h}_{k,d}|^2 P_k / \sigma_n^2$. In our notation, $h$ is the actual CSI, $\hat{h}$ denotes the aged CSI, and $\check{h}$ denotes the predicted CSI.

In addition to the best relay, OSTC needs to select another relay with the second strongest SNR, i.e., $\ddot{k} = \arg\max_{k \in \mathcal{DS} - \{\dot{k}\}} \hat{\gamma}_{k,d}$. In the first phase, the source broadcasts a pair of symbols $(x_1, x_2)$ to all relays over two consecutive symbol durations. The regenerated signals are encoded at the pair of selected relays by means of the Alamouti scheme, a unique space-time code achieving both full rate and full diversity. In the second phase, one relay transmits $(x_1, -x_2^*)$ while the other simultaneously sends $(x_2, x_1^*)$ at the same frequency.
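The two selection rules described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the list-based representation of the SNRs, and the sample values are assumptions, and the decoding threshold $2^{2R}-1$ is the SNR needed to support rate $2R$ on one hop.

```python
def decoding_subset(gamma_sr, target_rate):
    """Indices of relays whose source-relay SNR supports rate 2R (half-duplex)."""
    threshold = 2 ** (2 * target_rate) - 1  # SNR needed for log2(1 + SNR) >= 2R
    return [k for k, snr in enumerate(gamma_sr) if snr >= threshold]


def select_relay(decoding_set, gamma_rd_estimate):
    """Relay in the decoding subset with the largest estimated relay-destination SNR.

    ORS would pass aged SNRs here, PRS would pass predicted SNRs.
    Returns None when no relay decoded the source signal.
    """
    if not decoding_set:
        return None
    return max(decoding_set, key=lambda k: gamma_rd_estimate[k])


# Hypothetical SNR values (linear scale) for K = 4 relays and R = 1 bps/Hz.
gamma_sr = [5.0, 1.0, 8.0, 3.5]
aged_rd = [2.0, 9.0, 1.0, 4.0]
predicted_rd = [6.0, 9.0, 1.5, 2.0]

ds = decoding_subset(gamma_sr, target_rate=1.0)
ors_choice = select_relay(ds, aged_rd)
prs_choice = select_relay(ds, predicted_rd)
```

In this toy case relay 1 has the strongest relay-destination link but fails to decode, so it is excluded from both selections; ORS and PRS then differ only in which SNR estimate drives the arg max.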

B. Model of Aged CSI
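From a practical point of view, the CSI $\hat{h}$ used to select the relay(s) may differ remarkably from the actual CSI $h$ at the instant when the selected relay(s) forward the regenerated signals, leading to performance deterioration. To quantify this CSI inaccuracy, the correlation coefficient between $h$ and $\hat{h}$ is introduced, i.e.,

$$\rho_o = \frac{\mathbb{E}[h \hat{h}^*]}{\sqrt{\mathbb{E}[|h|^2]\,\mathbb{E}[|\hat{h}|^2]}}. \tag{2}$$

According to [16], we have

$$\hat{h} = \sigma_{\hat{h}} \left( \frac{\rho_o}{\sigma_h} h + \varepsilon \sqrt{1-\rho_o^2} \right),$$

where $\varepsilon \sim \mathcal{CN}(0, 1)$ and $\sigma_{\hat{h}}^2$ is the variance of $\hat{h}$. Under the assumption of Jakes' model, the correlation coefficient takes the value

$$\rho_o = J_0(2\pi f_d \tau),$$

where $f_d$ is the maximal Doppler frequency, $\tau$ stands for the delay between the outdated and actual CSI, and $J_0(\cdot)$ denotes the zeroth-order Bessel function of the first kind.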
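As a rough numerical illustration of the aged-CSI model, the following Python sketch evaluates $J_0(2\pi f_d \tau)$ with a truncated power series and draws aged CSI samples from a first-order correlation model of the form $\hat{h} = \sigma_{\hat{h}}(\rho_o h/\sigma_h + \varepsilon\sqrt{1-\rho_o^2})$, which is assumed here. The helper names, the sample size, the seed, and the unit variances are likewise assumptions made for illustration; $f_d = 100$ Hz and $\tau = 2$ ms match the values used in the numerical results.

```python
import math
import random


def bessel_j0(x, terms=30):
    """Zeroth-order Bessel function of the first kind via its power series."""
    return sum((-1) ** m * (x / 2) ** (2 * m) / math.factorial(m) ** 2
               for m in range(terms))


def complex_gaussian(rng, variance=1.0):
    """One sample of a zero-mean circularly symmetric complex Gaussian."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def aged_csi(h, rho, rng, sigma_h=1.0, sigma_aged=1.0):
    """Aged CSI correlated with the actual CSI h (assumed first-order model)."""
    eps = complex_gaussian(rng)
    return sigma_aged * (rho / sigma_h * h + math.sqrt(1.0 - rho ** 2) * eps)


rng = random.Random(0)
rho_o = bessel_j0(2 * math.pi * 100.0 * 2e-3)  # f_d = 100 Hz, tau = 2 ms

actual = [complex_gaussian(rng) for _ in range(20000)]
aged = [aged_csi(h, rho_o, rng) for h in actual]

# Sample estimate of the correlation coefficient between h and its aged version.
cross = sum(h * a.conjugate() for h, a in zip(actual, aged))
power = math.sqrt(sum(abs(h) ** 2 for h in actual) * sum(abs(a) ** 2 for a in aged))
empirical_rho = cross.real / power
```

The sample correlation should land close to `rho_o`, which is a quick sanity check that the generated aged CSI has the intended statistics.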

C. Model of Predicted CSI
To train a deep learning (DL) predictor, the objective is to generate predicted CSI $\check{h}$ that approximates the actual CSI (a zero-mean complex Gaussian random variable) as closely as possible. Hence, we can assume that $\check{h}$ also follows a zero-mean complex Gaussian distribution, i.e., $\check{h} \sim \mathcal{CN}(0, \sigma_h^2)$. The relationship between $\check{h}$ and $h$ can be modeled as

$$\check{h} = h + e, \tag{4}$$

where $e$ is the prediction error, a zero-mean complex Gaussian variable with variance $\sigma_e^2$. Like (2), the correlation coefficient between $\check{h}$ and $h$ can be obtained. Replacing $\hat{h}$ with $\check{h}$ and substituting (4) into (2) yields, for a prediction error $e$ that is independent of $h$,

$$\rho_p = \frac{\mathbb{E}[h \check{h}^*]}{\sqrt{\mathbb{E}[|h|^2]\,\mathbb{E}[|\check{h}|^2]}} = \frac{\sigma_h}{\sqrt{\sigma_h^2 + \sigma_e^2}}. \tag{5}$$

In the field of machine learning (ML), the normalized mean squared error (NMSE) is a common metric for measuring the accuracy of data fitting, and it can be easily acquired during both the training and prediction phases. In our case of channel prediction, the NMSE is

$$\mathrm{NMSE} = \frac{\mathbb{E}[|h - \check{h}|^2]}{\mathbb{E}[|h|^2]}, \tag{6}$$

and it can be straightforwardly derived that the NMSE is related to $e$ by $\mathrm{NMSE} = \sigma_e^2/\sigma_h^2$. Model-free ML techniques make traditional statistics-based performance analysis intractable, but the availability of the NMSE provides another route to performance evaluation.
The actual CSI $h$ and its predicted version $\check{h}$ follow a joint complex Gaussian distribution. Then, the instantaneous SNR of the relay-destination link $\gamma_{k,d}$ conditioned on $\check{\gamma}_{k,d}$ follows a noncentral chi-square distribution with two degrees of freedom. Substituting (5) into Eq. (12) of [17], the probability density function (PDF) $f_{\gamma_{k,d}|\check{\gamma}_{k,d}}(\gamma|\check{\gamma})$ in terms of $\sigma_e^2$ is obtained as (7), where $\bar{\gamma}_{k,d}$ is the average SNR of the relay-destination link and $I_0(\cdot)$ denotes the zeroth-order modified Bessel function of the first kind.
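The NMSE discussed in this subsection can be estimated directly from paired samples of actual and predicted CSI. The short Python sketch below shows one way to do so, assuming an additive error model in which the predicted CSI equals the actual CSI plus an independent complex Gaussian error of variance $\sigma_e^2$; the function names, the seed, and the sample size are hypothetical, and $\sigma_e^2 = 0.012$ is borrowed from the numerical results.

```python
import math
import random


def complex_gaussian(rng, variance=1.0):
    """One sample of a zero-mean circularly symmetric complex Gaussian."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def nmse(actual, predicted):
    """Sample NMSE: total squared error normalized by total power of the actual CSI."""
    error_power = sum(abs(h - p) ** 2 for h, p in zip(actual, predicted))
    signal_power = sum(abs(h) ** 2 for h in actual)
    return error_power / signal_power


rng = random.Random(0)
sigma_e2 = 0.012  # assumed prediction-error variance, with sigma_h^2 = 1

actual = [complex_gaussian(rng) for _ in range(20000)]
predicted = [h + complex_gaussian(rng, sigma_e2) for h in actual]  # assumed error model

estimated_nmse = nmse(actual, predicted)
```

With unit channel variance the estimate should be close to `sigma_e2`, in line with the relation between the NMSE and the error variance stated above.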

III. PREDICTIVE RELAY SELECTION
This section introduces the principles of deep learning with LSTM and the corresponding channel predictor, analyzes its computational complexity, and then depicts the protocol design to implement predictive relay selection.

A. DL-based Channel Predictor
Unlike feed-forward neural networks, recurrent neural networks (RNNs) can memorize historical information in their internal state, exhibiting great power in time-series prediction. However, back-propagated error signals in an RNN either tend to infinity (gradient exploding), resulting in oscillating weights, or tend to zero (gradient vanishing), which implies a prohibitively long training time. To address this problem, Long Short-Term Memory was proposed by Hochreiter and Schmidhuber in their pioneering work [15], where special units called memory cells and multiplicative gates that control the information flow are introduced into the RNN structure. Each LSTM memory cell contains three gates: an input gate that protects the memory contents from perturbation by irrelevant interference, a forget gate that filters out useless memory, and an output gate that controls the extent to which the memory information is applied to generate an output activation. Despite its short history, LSTM has been successfully applied in popular commercial products such as Apple Siri and Google Translate.
The upper part of Fig. 2 shows a deep LSTM network consisting of an input layer, multiple hidden layers, and an output layer. At time $t$, the instantaneous CSI $h[t]$ is acquired at the receiver by estimating a pilot symbol. Because relay selection relies on the SNR value, the real-valued amplitude $|h[t]|$ suffices, rather than the complex-valued $h[t]$, which in turn simplifies the implementation of the neural network by allowing real-valued weights. Feeding $|h[t]|$ into the input feed-forward layer yields an intermediate activation $d_t^{(1)}$, which activates the memory cells in the first hidden layer. Along with the recurrent unit from the previous time step, $d_t^{(2)}$ is generated and then forwarded to the second hidden layer. This recursive process continues until the output layer produces the predicted CSI $|\check{h}[t+1]|$.

As illustrated in the lower part of Fig. 2, a memory block has two internal states: the short-term state and the long-term state. At the $l$-th hidden layer, the short-term state of the previous time step $s_{t-1}^{(l)}$ and the current input $d_t^{(l)}$ jointly activate the three gates through fully connected (FC) layers:

$$\begin{aligned}
f_t^{(l)} &= \delta_g\left(W_f^{(l)} d_t^{(l)} + U_f^{(l)} s_{t-1}^{(l)} + b_f^{(l)}\right),\\
i_t^{(l)} &= \delta_g\left(W_i^{(l)} d_t^{(l)} + U_i^{(l)} s_{t-1}^{(l)} + b_i^{(l)}\right),\\
o_t^{(l)} &= \delta_g\left(W_o^{(l)} d_t^{(l)} + U_o^{(l)} s_{t-1}^{(l)} + b_o^{(l)}\right),
\end{aligned} \tag{8}$$

where $W$ and $U$ represent the weight matrices of the FC layers, $b$ denotes the bias vector, the subscripts $f$, $i$, and $o$ are associated with the forget, input, and output gates, respectively, and $\delta_g$ represents the sigmoid activation function $\delta_g(x) = \frac{1}{1+e^{-x}}$. In addition, there is an intermediate element

$$g_t^{(l)} = \delta_h\left(W_g^{(l)} d_t^{(l)} + U_g^{(l)} s_{t-1}^{(l)} + b_g^{(l)}\right), \tag{9}$$

where $\delta_h$ is the hyperbolic tangent (tanh) function $\delta_h(x) = \frac{e^{2x}-1}{e^{2x}+1}$. Traversing the block, the previous long-term state $c_{t-1}^{(l)}$ first discards some outdated memories at the forget gate, takes on new information selected by $i_t^{(l)}$, and then transforms into

$$c_t^{(l)} = f_t^{(l)} \otimes c_{t-1}^{(l)} + i_t^{(l)} \otimes g_t^{(l)}, \tag{10}$$

where $\otimes$ denotes the Hadamard product (element-wise multiplication) for matrices. Further, $c_t^{(l)}$ goes through the tanh function and is then filtered by $o_t^{(l)}$ to update the short-term memory, which also serves as the output activation, i.e.,

$$s_t^{(l)} = o_t^{(l)} \otimes \delta_h\left(c_t^{(l)}\right). \tag{11}$$
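To make the data flow through one memory block concrete, the following is a minimal pure-Python sketch of a single time step of a standard LSTM layer (sigmoid gates, a tanh candidate, a Hadamard-product state update). It is not the authors' implementation: the function names, the dictionary layout of the parameters, and the all-zero weights used for the hand-checkable example are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, d, s):
    """Compute W d + U s + b, with matrices stored as lists of rows."""
    return [sum(w * x for w, x in zip(w_row, d))
            + sum(u * y for u, y in zip(u_row, s)) + bias
            for w_row, u_row, bias in zip(W, U, b)]


def lstm_step(params, d_t, s_prev, c_prev):
    """One time step of an LSTM layer; returns (short-term state, long-term state)."""
    f = [sigmoid(v) for v in affine(*params["f"], d_t, s_prev)]    # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], d_t, s_prev)]    # input gate
    o = [sigmoid(v) for v in affine(*params["o"], d_t, s_prev)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], d_t, s_prev)]  # candidate memory
    c_t = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    s_t = [oj * math.tanh(cj) for oj, cj in zip(o, c_t)]
    return s_t, c_t


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Hand-checkable toy layer: 1 input, 2 cells, all weights and biases zero,
# so every gate equals 0.5 and the candidate memory equals 0.
n_in, n_cells = 1, 2
params = {name: (zeros(n_cells, n_in), zeros(n_cells, n_cells), [0.0] * n_cells)
          for name in ("f", "i", "o", "g")}

s_t, c_t = lstm_step(params, d_t=[0.7], s_prev=[0.0, 0.0], c_prev=[1.0, -2.0])
```

With zero weights the long-term state is simply halved by the forget gate, and the short-term state is half of its tanh, which makes the update rule easy to verify by hand before trained weights are plugged in.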

B. Computational Complexity
Computational complexity is a general concern with deep learning. Here, we assess the predictor's complexity by counting the number of complex multiplications. The deep recurrent network can be described as follows: an input layer with $n_i$ neurons, an output layer with $n_o$ neurons, and $L$ hidden layers with $n_c^l$ LSTM cells at layer $l = 1, \ldots, L$. According to [14], the number of parameters $N_{DL}$, including both weights and biases, can be computed by (12). Under typical stochastic gradient descent training, each parameter requires $O(1)$ operations at each time step. Consequently, the complexity per time step in the training phase is $O(N_{DL})$. During the prediction phase, each weight requires one complex-valued multiplication, amounting to a complexity of $O(N_{DL})$ per prediction.

C. Predictive Relay Selection
Implementations of cooperative relaying schemes fall mainly into two categories: distributed [6] and centralized. The former relies on a timer at each relay and applies a contention period (CP) to determine the best relay. The latter has a centralized controller, e.g., the destination, which collects global CSI, makes the selection decision, and informs the selected relays to retransmit. The information exchange between the controller and the relays not only requires extra signaling but also introduces feedback delay that exacerbates the aged CSI problem. With channel prediction, the CSI obtained at the current frame is used to generate predicted CSI for the next frame. Such a prediction horizon provides a new degree of freedom for designing a relaying protocol. Here, we describe a distributed implementation of predictive relay selection, as follows:

1) At frame $t$, as shown in Fig. 3, the source broadcasts a packet containing a pilot called Ready-To-Send (RTS) and the data payload. The channel gain $h_{s,k}[t]$ is acquired at relay $k$ by estimating the RTS and is used for detecting the data symbols. The relays that correctly decode the source signal comprise $\mathcal{DS}$ and will participate in the relay selection process.

4) Then, a timer with a duration inversely proportional to $\check{h}_{k,d}[t]$ is started at relay $k$.

5) The timer on the relay with the largest channel gain expires first, and this relay sends a short packet to announce its presence.

6) Upon receiving the best relay's announcement packet, the other relays terminate their timers and keep silent. The selected relay retransmits the regenerated signal until the end of this frame.

It is possible that the number of relays in $\mathcal{DS}$ is zero or that the timer duration is too long because of a very small channel gain. To deal with these anomalies, a maximal duration must be set for the CP. If this duration expires, the relay selection process is terminated regardless of whether the best relay has announced itself.
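The timer-based contention in the last steps of the protocol, including the maximal CP duration, can be mimicked with the following Python sketch. It is a simplified illustration under stated assumptions: timers are evaluated centrally rather than running on separate devices, the predicted gains are given as amplitudes, and the names `contention`, `scale`, and `max_duration` are hypothetical.

```python
def contention(predicted_gain, decoding_set, scale=1.0, max_duration=10.0):
    """Emulate the contention period of distributed relay selection.

    Each relay in the decoding subset starts a timer of duration
    scale / predicted_gain. The relay whose timer expires first wins.
    Returns (winner, elapsed_time), or (None, max_duration) if the
    contention period ends before any timer expires.
    """
    timers = {k: scale / predicted_gain[k]
              for k in decoding_set if predicted_gain[k] > 0}
    if not timers:
        return None, max_duration      # empty decoding subset
    winner = min(timers, key=timers.get)
    if timers[winner] > max_duration:
        return None, max_duration      # all gains too small: CP times out
    return winner, timers[winner]


# Hypothetical predicted amplitudes of the relay-destination channels.
gains = {1: 0.8, 2: 1.6, 3: 0.2}

normal = contention(gains, decoding_set=[1, 2, 3])
empty = contention(gains, decoding_set=[])
timeout = contention(gains, decoding_set=[3], max_duration=2.0)
```

The three calls cover the normal case and the two anomalies mentioned above: an empty decoding subset and a timer that would outlast the contention period.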

IV. OUTAGE PROBABILITY ANALYSIS
In information theory, an outage is defined as the event that the instantaneous channel capacity falls below a target rate $R$, in which case reliable communication cannot be realized whatever coding is used. The corresponding metric, referred to as the outage probability, is defined by $P(R) = \mathbb{P}\{\log_2(1+\gamma) < R\}$, where $\mathbb{P}$ denotes mathematical probability. Let $\mathcal{DS}_L$ denote the set of all decoding subsets having $L$ relays, and let $\mathcal{DS}_L^p$ denote the $p$-th element of $\mathcal{DS}_L$, namely, $\mathcal{DS}_L = \{\mathcal{DS}_L^p \mid p = 1, \ldots, |\mathcal{DS}_L|\}$, where $|\cdot|$ represents the cardinality of a set. Then, the outage probability of PRS can be calculated by

$$P(R) = \sum_{L=0}^{K} \sum_{p=1}^{|\mathcal{DS}_L|} P(R \mid \mathcal{DS}_L^p)\, \mathbb{P}(\mathcal{DS}_L^p), \tag{13}$$

where $\mathbb{P}(\mathcal{DS}_L^p)$ is the occurrence probability of $\mathcal{DS}_L^p$, and $P(R \mid \mathcal{DS}_L^p)$ is the outage probability conditioned on $\mathcal{DS}_L^p$. If all source-relay links are independent and identically distributed (i.i.d.) Rayleigh channels, the values of $\mathbb{P}(\mathcal{DS}_L^p)$ are the same for any $p \in \{1, \ldots, |\mathcal{DS}_L|\}$, and so are the values of $P(R \mid \mathcal{DS}_L^p)$ if all relay-destination channels are i.i.d. Then, (13) can be simplified to

$$P(R) = \sum_{L=0}^{K} P(R \mid |\mathcal{DS}| = L)\, \mathbb{P}(|\mathcal{DS}| = L), \tag{14}$$

where $\mathbb{P}(|\mathcal{DS}| = L)$ denotes the probability that the number of relays in the decoding subset is $L$. In Rayleigh channels, the instantaneous SNR of each source-relay channel is exponentially distributed, i.e., $\gamma_{s,k} \sim \mathrm{EXP}(1/\bar{\gamma}_{s,k})$, whose cumulative distribution function (CDF) can be expressed by

$$F_{\gamma_{s,k}}(x) = 1 - e^{-x/\bar{\gamma}_{s,k}}, \quad x > 0. \tag{15}$$

According to (1), the probability that a relay falls into $\mathcal{DS}$ equals $1 - F_{\gamma_{s,k}}(\gamma_o)$, where $\gamma_o = 2^{2R} - 1$ is the threshold SNR corresponding to the target rate $R$. The number of relays that successfully decode, $L$ out of $K$, follows a binomial distribution, so we have

$$\mathbb{P}(|\mathcal{DS}| = L) = \binom{K}{L} \left(e^{-\gamma_o/\bar{\gamma}_{s,k}}\right)^{L} \left(1 - e^{-\gamma_o/\bar{\gamma}_{s,k}}\right)^{K-L}. \tag{16}$$

Thus, the second term in (14) is determined. We now turn to the first term $P(R \mid |\mathcal{DS}| = L)$, which is derived, conditioned on the value of $L$, as follows:

a) $L=0$: If no relay can decode the source's signal, the relaying will definitely fail, i.e.,

$$P(R \mid |\mathcal{DS}| = 0) = 1. \tag{17}$$

b) $L=1$: Only a single relay successfully decodes the signal, so it directly becomes $\dot{k}$ and the relay selection process is skipped.
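Similar to (15), the CDF of the SNR over this relay-destination link is given by $F_{\gamma_{\dot{k},d}}(x) = 1 - e^{-x/\bar{\gamma}_{k,d}}$. The outage probability conditioned on $L=1$ is equal to

$$P(R \mid |\mathcal{DS}| = 1) = F_{\gamma_{\dot{k},d}}(\gamma_o) = 1 - e^{-\gamma_o/\bar{\gamma}_{k,d}}. \tag{18}$$

c) $L>1$: In this case, a relay is opportunistically selected from the decoding set according to the predicted CSI of the relay-destination links. For the sake of mathematical tractability, we rewrite $\check{\gamma}_{k,d}$, $k \in \mathcal{DS}_L$, as $\check{\gamma}_l$, $l \in \{1, \ldots, L\}$. Define $\mathcal{A}_{\dot{k}}$ as the event that

$$\mathcal{A}_{\dot{k}} \triangleq \left\{ \check{\gamma}_{\dot{k}} > \check{\gamma}_l,\ \forall\, l \in \{1, \ldots, L\},\ l \neq \dot{k} \right\}, \tag{19}$$

which means that $\check{\gamma}_{\dot{k}}$ is the largest among the $L$ elements $(\check{\gamma}_1, \ldots, \check{\gamma}_L)$. However, $\check{\gamma}_{\dot{k}}$ is used only for selection; the post-processing SNR for performance evaluation should be the actual SNR $\gamma_{\dot{k}}$, whose CDF can be calculated by

$$F_{\gamma_{\dot{k}}}(y) = \sum_{\dot{k}=1}^{L} \mathbb{P}(\gamma_{\dot{k}} \leq y \mid \mathcal{A}_{\dot{k}})\, \mathbb{P}(\mathcal{A}_{\dot{k}}), \tag{20}$$

where $\mathbb{P}(\mathcal{A}_{\dot{k}})$ denotes the occurrence probability of $\mathcal{A}_{\dot{k}}$, which equals $\frac{1}{L}$ since each relay has the same chance of being selected under the i.i.d. channel assumption. $\mathbb{P}(\gamma_{\dot{k}} \leq y \mid \mathcal{A}_{\dot{k}})$ is the probability that the actual SNR is below a threshold $y$ conditioned on $\mathcal{A}_{\dot{k}}$, which can be computed by

$$\mathbb{P}(\gamma_{\dot{k}} \leq y \mid \mathcal{A}_{\dot{k}}) = \int_0^y \int_0^\infty f_{\gamma_{\dot{k}}|\check{\gamma}_{\dot{k}}}(\gamma \mid \check{\gamma})\, f_{\check{\gamma}_{\dot{k}}|\mathcal{A}_{\dot{k}}}(\check{\gamma})\, d\check{\gamma}\, d\gamma, \tag{21}$$

where $f_{\gamma_{\dot{k}}|\check{\gamma}_{\dot{k}}}(\gamma \mid \check{\gamma})$ stands for the PDF of $\gamma_{\dot{k}}$ conditioned on its predicted version $\check{\gamma}_{\dot{k}}$, which is already given in (7), and $f_{\check{\gamma}_{\dot{k}}|\mathcal{A}_{\dot{k}}}(\check{\gamma})$ denotes the PDF of the largest predicted SNR conditioned on $\mathcal{A}_{\dot{k}}$. By analogy with multi-user selection with a max-SNR scheduler [18], the latter can be written as (22). Substituting (7) and (22) into (21) yields (23).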
Thus, the conditional outage probability for $L > 1$ is given by (24). Setting $L=1$ in (24) gives a result equal to (18); thus, (24) can be extended to cover the case of $L=1$. Now, the closed-form expression for the first term in (14) is available. Substituting (16), (17), and (24) into (14), the overall outage probability of PRS in the presence of aged CSI can be computed as (25).

V. NUMERICAL RESULTS
In this section, we use Monte Carlo simulations to validate the analytical results and to evaluate performance. Given i.i.d. Rayleigh channels with a normalized gain $\sigma_h^2 = 1$, the outage probabilities of PRS, ORS, and OSTC in the presence of aged CSI are provided. The maximal Doppler frequency is set to $f_d = 100$ Hz, emulating a fast-fading environment, and an end-to-end target rate of $R = 1$ bps/Hz is applied for outage calculations. Training data sets are built by sampling a series of 7500 consecutive channel responses $\{h[t] \mid t = 1, 2, \ldots, 7500\}$, with and without considering the impact of noise in channel estimation. The cooperative network has $K=4$ DF relays, and equal power allocation among nodes is used. Assuming the end-to-end power is $P$, the source transmits with $P_s = 0.5P$, resulting in an average SNR $\bar{\gamma}_{s,k} = 0.5P/\sigma_n^2$ for the source-relay channels, while $\bar{\gamma}_{k,d} = 0.5P/\sigma_n^2$ for the relay-destination channels. Detailed simulation parameters are summarized in Table I.

Fig. 4. Outage probabilities of PRS, ORS, and OSTC as a function of the end-to-end average SNR $\bar{\gamma} = P/\sigma_n^2$ with $K=4$. Analytical results of ORS and OSTC are derived from (20) in [12], and that of PRS is from (25).

As illustrated in Fig. 4, the markers indicating the numerical results fall on the corresponding curves of the analytical results, corroborating the theoretical analyses in this paper. As the benchmark, the curve of ORS with perfect knowledge of CSI, i.e., $\rho_o = 1$, is plotted as the optimal performance, which achieves a diversity of $d=4$ with an outage probability that decays at a rate of $1/\bar{\gamma}^4$ at high SNR. Given the frame length of 2 ms and $f_d = 100$ Hz, Jakes' model gives the correlation coefficient of aged CSI as $\rho_o = J_0(0.4\pi) \approx 0.6425$. As we can see, channel aging substantially deteriorates the performance: the diversity of ORS falls to 1, i.e., no diversity, and the curve decays slowly at a rate of $1/\bar{\gamma}$ at high SNR. Although OSTC can recover some of the loss, with a diversity of 2 obtained by using a pair of relays, its gap to the optimal performance is still large, amounting to around 3 dB at an outage probability of $10^{-2}$. For PRS, we consider two different types of data for training the predictor: an ideal case that adopts perfect CSI, and a practical case in which the data is CSI estimated at an SNR of 30 dB. The former achieves near-optimal performance with an NMSE of $\sigma_e^2 = 0.012$, corresponding to $\rho_p = 0.994$ according to (5), which is clearly better than aged CSI with $\rho_o = 0.6425$. The noise degrades the performance slightly, with $\sigma_e^2 = 0.05$ and $\rho_p = 0.9754$, but PRS still clearly outperforms ORS and OSTC.
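The qualitative gap between selection on aged CSI and selection on predicted CSI can be reproduced with a small Monte Carlo experiment. The Python sketch below is only a rough illustration and not the simulator behind Fig. 4: it assumes that the CSI used for selection is related to the actual CSI through a correlation coefficient `rho` in both cases (with unit channel variances), and the function names, seed, trial count, and the 15 dB operating point are hypothetical choices.

```python
import math
import random


def complex_gaussian(rng, variance=1.0):
    """One sample of a zero-mean circularly symmetric complex Gaussian."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def simulate_outage(snr_db, rho, num_relays=4, rate=1.0, trials=20000, seed=1):
    """Outage probability of single-relay selection driven by imperfect CSI."""
    rng = random.Random(seed)
    link_snr = 0.5 * 10 ** (snr_db / 10.0)   # equal power split: 0.5 P / sigma_n^2
    threshold = 2 ** (2 * rate) - 1          # SNR needed for rate 2R per hop
    outages = 0
    for _ in range(trials):
        best_metric, snr_of_selected = -1.0, 0.0
        for _ in range(num_relays):
            # Source-relay link: relays that cannot decode are skipped.
            if link_snr * abs(complex_gaussian(rng)) ** 2 < threshold:
                continue
            h = complex_gaussian(rng)        # actual relay-destination CSI
            h_sel = rho * h + math.sqrt(1.0 - rho ** 2) * complex_gaussian(rng)
            if abs(h_sel) ** 2 > best_metric:
                best_metric = abs(h_sel) ** 2
                snr_of_selected = link_snr * abs(h) ** 2
        # No decoding relay leaves snr_of_selected at 0, which counts as an outage.
        if snr_of_selected < threshold:
            outages += 1
    return outages / trials


p_aged = simulate_outage(15.0, rho=0.6425)      # selection on aged CSI
p_predicted = simulate_outage(15.0, rho=0.994)  # selection on predicted CSI
```

Because both runs share the same seed and draw the same number of random samples, they see identical channel realizations, so the difference between `p_aged` and `p_predicted` isolates the effect of the CSI quality used for selection.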
Last but not least, the complexity of the predictor is investigated. The applied neural network has two hidden layers with $n_c^1 = 20$ and $n_c^2 = 10$ LSTM cells, respectively, and an input and an output layer with a single neuron each ($n_i = n_o = 1$), amounting to $N_{DL} = 2811$ according to (12). It is worthwhile to clarify how much computing resource is required. Given 500 predictions per second, as determined by the frame length of 2 ms, approximately $6 \times 10^6$ floating-point operations per second (FLOPS) are needed. Compared with the capability of a current digital signal processor, e.g., TI 66AK2x, which provides more than $10^4$ million instructions per second (MIPS), the required computing resource is negligible (< 0.001).
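The arithmetic behind this comparison can be written out as follows. This is a back-of-the-envelope check that treats one instruction as roughly comparable to one floating-point operation, which is a simplifying assumption rather than a statement about the processor.

```latex
\[
\frac{1}{2\ \text{ms}} = 500\ \text{predictions per second},
\qquad
\frac{6 \times 10^{6}\ \text{FLOPS}}{10^{4} \times 10^{6}\ \text{instructions per second}}
= 6 \times 10^{-4} < 10^{-3}.
\]
```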

VI. CONCLUSIONS
In this paper, we proposed a deep learning-based relaying method to achieve cooperative diversity. Taking advantage of the time-series prediction capability of a deep recurrent neural network, a channel predictor was built as a new degree of freedom for realizing predictive relay selection. The proposed scheme opportunistically selects a single relay with the largest predicted CSI to retransmit, which alleviates the effect of aged CSI while avoiding the problem of multi-relay synchronization. Analytical and numerical results on outage probabilities showed that it clearly outperforms opportunistic relay selection and opportunistic space-time coding under channel aging. In addition, the computational complexity was analyzed, revealing that the required computing resource is negligible in comparison with the capability of off-the-shelf hardware. From the perspective of both performance and complexity, it is a good candidate for practical implementation.