Deep Learning for Signal Processing with Predictions of Channel Profile, Doppler Shift and Signal-To-Noise Ratio

This paper proposes Deep Learning (DL) for Signal Processing, reviewing and discussing three recent DL application advancements in wireless communication that predict the channel profile, Doppler shift, and signal-to-noise ratio (SNR) of LTE and 5G systems. MATLAB simulations are performed on time-domain and frequency-domain signals, emulating real wireless environments, with randomized payloads (i.e., non-data-aided), modulation types [QPSK, 16QAM, 64QAM], Doppler shifts [0, 50, ..., 550] Hz, and SNRs in [-10, 20] dB. The predictions are accurate at ~95%. The methodology consists of input diversity to empower multiple inputs per prediction, binary prediction to reduce prediction complexity and uncertainty, and a hybrid DL convolutional neural network and long short-term memory (CNN-LSTM) model to learn features within every input and across inputs. Additionally, the paper presents common lessons learned and future research directions. The designed methodology provides an effective backup scheme for prediction accuracy enhancement when the traditional single-input, single-output, multi-class prediction scheme underperforms. Furthermore, the review aims to build on the methodology's success across the three applications, paving the way for a universal DL prediction methodology for wireless communications (i.e., DL for Signal Processing) and other domains.


I. INTRODUCTION
In wireless communication, knowledge of the wireless channel and noise is essential for signal detection, as shown in (1):

y(t) = h(t) * x(t) + n(t)    (1)

where y(t) is the received (i.e., detected) output signal, x(t) is the transmitted input signal, h(t) is the wireless channel impulse response, n(t) is noise, and * is the convolution operator. The channel h(t) is typically estimated using Clarke's wireless channel model (2).
h(t) = Σ_{m=1}^{N_l} α_m e^{j(2π f_D t cos(θ_m) + ϕ_m)}    (2)

where N_l is the number of multipaths, α_m is the fading amplitude of the m-th path, f_D is the maximum Doppler frequency shift, θ_m is the arrival angle of the m-th path, and ϕ_m is a uniformly distributed random phase. As a result, the channel profile, Doppler shift, and noise are fundamental parameters for optimal signal detection in wireless communication.
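For intuition, the sum-of-sinusoids form of (2) can be sketched as follows. This is an illustrative Python/NumPy sketch, not the paper's MATLAB implementation; the function name, path count, and the random distributions of the amplitudes, angles, and phases are our assumptions for demonstration only.

```python
import numpy as np

def clarke_channel(t, n_paths=8, f_d=100.0, seed=0):
    """Sketch of Clarke's model (2):
    h(t) = sum over m of alpha_m * exp(j*(2*pi*f_D*t*cos(theta_m) + phi_m)).
    Path parameters are drawn randomly here for illustration."""
    rng = np.random.default_rng(seed)
    alpha = rng.rayleigh(size=n_paths) / np.sqrt(n_paths)  # fading amplitudes alpha_m
    theta = rng.uniform(0, 2 * np.pi, n_paths)             # arrival angles theta_m
    phi = rng.uniform(0, 2 * np.pi, n_paths)               # uniform random phases phi_m
    phase = 2 * np.pi * f_d * t[None, :] * np.cos(theta)[:, None] + phi[:, None]
    return (alpha[:, None] * np.exp(1j * phase)).sum(axis=0)  # complex-valued h(t)

t = np.arange(1000) * 1e-5            # 10 ms at 100 kHz sampling
h = clarke_channel(t, f_d=50.0)       # 50 Hz maximum Doppler shift
```

The complex envelope h fades at a rate that grows with f_D, which is one reason the Doppler shift is a fundamental parameter for signal detection.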
In practice, for simplicity, estimation of channel state information (CSI) is primarily performed in place of the channel profile. CSI provides average channel information in time, without delays and gains of multipath components (MPCs). Additionally, the Doppler shift estimation is not required as it is embedded in the channel estimation. However, the availability of the accurate channel profile and the Doppler shift would enhance signal detection performance and produce more efficient and productive wireless communications such as higher data rate via higher signal modulation and less interference via lower signal power.
Traditional estimations of CSI, Doppler shift, and noise are based on statistical estimation techniques. With the advancement of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) in various fields, most notably in Computer Vision (CV) and Natural Language Processing (NLP), ML/DL have been applied for parameter predictions in the physical layer [5]-[7], modulation recognition [9], power allocation [8], channel estimation [10], and signal detection [11]. However, the majority of ML/DL applications used simplistic ML models or single-input, single-output, multi-class DL models such as convolutional neural networks (CNNs). Additionally, simulations were limited to generic wireless communication systems (i.e., not LTE or 5G). Furthermore, signal data were generated based on unrealistic wireless environments, without considering the comprehensive variation impacts of channel profile, Doppler shift, or SNR.
Recently, three DL applications that predict the channel profile, the Doppler shift, and signal-to-noise ratio (SNR) have been reported [1]-[4]. The enabling methodology underlying the three applications consists of a hybrid DL CNN-LSTM model, input diversity, and binary prediction. This DL methodology is proven effective across the three applications in simulated real-world wireless conditions. This review aims to build on the methodology's success as a universal DL prediction methodology for parameter predictions in wireless communications (i.e., DL for Signal Processing) and potentially in other domains. The remainder of the paper is organized as follows: Section II describes the methodology, Section III details the simulations, Section IV summarizes lessons learned, Section V identifies future research, Section VI discusses DL for Signal Processing, and Section VII presents the conclusion.

II. THE METHODOLOGY
This section describes the DL prediction methodology.

A. Input diversity
Conventional DL prediction techniques are single-input, single-output, using one input per prediction. Input diversity promotes using a sequence of inputs for each prediction to increase prediction accuracy via additional inter-input representation learning. Long short-term memory (LSTM) neural networks effectively handle predictions with time-series data or inputs with temporal diversity, where multiple inputs are sequentially collected and fused for each prediction. LSTM can be extended to prediction inputs with spatial diversity, where multiple inputs are collected from several sources. For channel profile prediction, spatial and temporal diversity can be utilized to learn inter-input features in every sequence of inputs collected from multiple antennas simultaneously (e.g., MIMO) and from multiple sequential time slots on each antenna, respectively.
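The grouping step behind input diversity can be sketched as below, assuming inputs with like ground-truth labels are simply concatenated into fixed-length sequences. This is an illustrative Python sketch (the paper's pipeline is in MATLAB); the function name and toy shapes are ours.

```python
import numpy as np

def make_sequences(inputs, labels, seq_len):
    """Combine inputs sharing a ground-truth label into fixed-length
    sequences so the model can also learn inter-input features.
    inputs: (N, ...) array; labels: (N,); leftover inputs are dropped."""
    seqs, seq_labels = [], []
    for lab in np.unique(labels):
        group = inputs[labels == lab]
        for i in range(len(group) // seq_len):
            seqs.append(group[i * seq_len:(i + 1) * seq_len])
            seq_labels.append(lab)
    return np.stack(seqs), np.array(seq_labels)

# e.g. 10 single-antenna grids with 2 channel-profile labels, 5 inputs/sequence
x = np.random.randn(10, 1008)             # hypothetical frequency-domain inputs
y = np.array([0] * 5 + [1] * 5)
xs, ys = make_sequences(x, y, seq_len=5)  # xs.shape == (2, 5, 1008)
```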

B. Binary prediction
Traditional DL prediction techniques are multi-class, using one predictor to predict a label from a list of labels. Binary prediction is a divide-and-conquer algorithm in which multiple predictors are used, each of which predicts only one label or a range of labels relative to a target label (e.g., TDL-C or not), reducing complexity and uncertainty and therefore enhancing prediction accuracy. Each binary predictor targets a unique channel profile in the channel profile list. For instance, the 5G TDL-C binary channel profile predictor predicts either TDL-C or not. An ensemble of three binary channel profile predictors is used to cover the LTE channel profile list [EPA, EVA, ETU] and five to cover the 5G channel profile list [TDL-A, TDL-B, TDL-C, TDL-D, TDL-E].
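A minimal sketch of the ensemble logic, in Python for illustration: each binary predictor scores "its" profile, and the ensemble reports the most confident yes. The stand-in predictors below are placeholders for trained CNN-LSTM models; the names and probabilities are ours.

```python
import numpy as np

PROFILES = ["TDL-A", "TDL-B", "TDL-C", "TDL-D", "TDL-E"]

def ensemble_predict(seq, binary_predictors):
    """One binary predictor per 5G channel profile; each returns
    P(profile == its target label). Report the most confident 'yes'."""
    scores = {p: binary_predictors[p](seq) for p in PROFILES}
    return max(scores, key=scores.get)

# stand-in predictors (a trained CNN-LSTM would go here); TDL-C is made
# confident on purpose so the ensemble output is predictable
rng = np.random.default_rng(1)
stand_ins = {p: (lambda seq, hit=(p == "TDL-C"):
                 0.9 if hit else rng.uniform(0.0, 0.5))
             for p in PROFILES}
label = ensemble_predict(np.zeros(8), stand_ins)   # "TDL-C"
```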

C. Convolutional Neural Network and Long Short Term Memory (CNN-LSTM) Model
Deep learning-based classifications or predictions commonly use convolutional neural network (CNN) models since they are effective and efficient at learning spatial features within every input. However, for time-series data, where a sequence of inputs is used for each prediction, long short-term memory (LSTM) models are typically used to learn temporal features across inputs in each input sequence. LSTM has been used to predict noise power in optical fiber communication systems [13]. We propose a hybrid CNN-LSTM model to learn spatial and temporal features within each input and across multiple inputs.
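The division of labor can be sketched numerically: a small convolutional stage extracts per-input (spatial) features, and an LSTM cell carries state across the input sequence (temporal features). This is a from-scratch NumPy sketch with toy, untrained weights, not the paper's MATLAB model; all dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_features(x, kernels):
    """CNN stage: per-input spatial features. One complex input's I/Q parts
    are stacked as 2 channels, convolved, ReLU'd, and global-average-pooled."""
    iq = np.stack([x.real, x.imag])                          # (2, n)
    windows = np.lib.stride_tricks.sliding_window_view(iq, kernels.shape[-1], axis=1)
    fmaps = np.maximum(0.0, np.einsum('cnw,kcw->kn', windows, kernels))
    return fmaps.mean(axis=1)                                # (n_kernels,)

def lstm_step(f, h, c, W, U, b):
    """LSTM stage: temporal features, one step per input in the sequence.
    Gates are stacked as [input, forget, output, candidate]."""
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    zi, zf, zo, zg = np.split(W @ f + U @ h + b, 4)
    c = sig(zf) * c + sig(zi) * np.tanh(zg)
    h = sig(zo) * np.tanh(c)
    return h, c

# toy sizes: 5 inputs/sequence, 64 samples/input, 4 kernels, 8 hidden units
n_seq, n_samp, n_k, n_h = 5, 64, 4, 8
kernels = rng.standard_normal((n_k, 2, 7)) * 0.1             # (kernels, channels, width)
W = rng.standard_normal((4 * n_h, n_k)) * 0.1
U = rng.standard_normal((4 * n_h, n_h)) * 0.1
b = np.zeros(4 * n_h)
h, c = np.zeros(n_h), np.zeros(n_h)
for x in rng.standard_normal((n_seq, n_samp)) + 1j * rng.standard_normal((n_seq, n_samp)):
    h, c = lstm_step(cnn_features(x, kernels), h, c, W, U, b)
p = 1.0 / (1.0 + np.exp(-(rng.standard_normal(n_h) @ h)))    # binary head: P(target label)
```

In the actual applications, the CNN stage is adopted from a MATLAB example and the final sigmoid/softmax head produces the binary or multi-class prediction.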

III. SIMULATIONS
This section reviews and summarizes the simulations of the three DL applications that utilized the DL methodology. More details of the dataset generation and the CNN-LSTM model are available in the individual papers [1]-[4]. The MATLAB code is available at https://github.com/thinhngo11/DL-Signal-Processing.

A. Dataset Generation
Computer simulations are often used for the validation of wireless channel parameter estimation techniques. The MATLAB Communications, LTE, and 5G Toolboxes are used for wireless fading channel modeling and RF received signal dataset generation. To emulate real-world wireless environments and for prediction robustness, every input is parameterized by randomly selecting a modulation type in [QPSK, 16QAM, 64QAM], a Doppler shift in [0, 50, ..., 550] Hz, an SNR in [-10, 20] dB, and a channel profile in [EPA, EVA, ETU] for the LTE system or in [TDL-A, TDL-B, TDL-C, TDL-D, TDL-E] for the 5G system. Payloads are randomized (i.e., non-data-aided, NDA) for spectral efficiency. For input diversity, the dataset is reorganized such that multiple inputs with like ground-truth labels are combined into input sequences of desired lengths (i.e., 1, 2, ..., 5, 10, 15). Depending on the prediction target (i.e., channel profile, Doppler shift, or SNR), labels are assigned accordingly to every input sequence. This study is conducted on traditional single-antenna systems; however, it can readily be extended to multiple-input, multiple-output (MIMO) systems.
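The per-input randomization can be sketched as follows, in Python for illustration. The fading-channel and Doppler application is elided (the paper uses the MATLAB LTE/5G Toolboxes for that); the PSK-ring constellation is a unit-power stand-in for the QAM alphabets, and all names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
MODS = {"QPSK": 4, "16QAM": 16, "64QAM": 64}
DOPPLERS = np.arange(0, 551, 50)      # [0, 50, ..., 550] Hz

def random_input(n_sym=256):
    """Draw one randomized (non-data-aided) input: random payload,
    modulation, Doppler label, and SNR, with AWGN applied at that SNR."""
    mod = rng.choice(list(MODS))
    M = MODS[mod]
    # random payload on a unit-power PSK ring (stand-in for the QAM alphabets)
    syms = np.exp(2j * np.pi * rng.integers(0, M, n_sym) / M)
    snr_db = rng.uniform(-10, 20)
    noise_std = np.sqrt(10 ** (-snr_db / 10) / 2)
    noise = noise_std * (rng.standard_normal(n_sym) + 1j * rng.standard_normal(n_sym))
    labels = {"mod": str(mod), "doppler_hz": int(rng.choice(DOPPLERS)),
              "snr_db": snr_db}
    return syms + noise, labels

x, y = random_input()
```

Each call yields one non-data-aided input together with the ground-truth labels that are later attached to its input sequence.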
1) LTE dataset Generation: MATLAB example "LTE Downlink Channel Estimation and Equalization" is adopted. The dataset consists of 54,000 inputs, each of which is a grid consisting of 6 resource blocks and one subframe (i.e., 14 OFDM symbols), including synchronizing and reference signals, and containing 1,920 time-domain complex-valued samples or 1,008 frequency-domain complex-valued samples.
2) 5G dataset Generation: MATLAB example "Deep Learning Data Synthesis for 5G Channel Estimation" is adopted. The dataset consists of 12,000 inputs, each of which is a grid consisting of 51 resource blocks and one slot (i.e., 14 OFDM symbols), including PDSCH DM-RS precoding/mapping, and containing 15,376 time-domain complex-valued samples or 8,568 frequency-domain complex-valued samples.
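The quoted frequency-domain sample counts follow directly from the grid dimensions (12 subcarriers per resource block, 14 OFDM symbols per LTE subframe or 5G slot), as this small check shows:

```python
SC_PER_RB, SYMBOLS = 12, 14          # subcarriers per resource block, OFDM symbols
lte_fd = 6 * SC_PER_RB * SYMBOLS     # LTE: 6 RBs x 12 x 14 = 1008 samples
nr_fd = 51 * SC_PER_RB * SYMBOLS     # 5G: 51 RBs x 12 x 14 = 8568 samples
assert (lte_fd, nr_fd) == (1008, 8568)
```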

B. CNN-LSTM Model Construction
MATLAB Deep Learning toolbox is used to construct and simulate the model. We adopt the CNN model from MATLAB example "Modulation Classification with Deep Learning" and incorporate an LSTM model. In real-world applications, training can be done offline and continually in-service using real RF signals.

C. Prediction Results
Computer simulations of channel profile, Doppler shift, and SNR predictions in LTE and 5G systems are reviewed and discussed in this section. As with DL applications in other domains (e.g., computer vision), DL prediction performances, including accuracy pattern, mean, and variance, are data-driven and attained intuitively and empirically (e.g., by trial and error). Table I summarizes the average prediction accuracy of the traditional multi-class prediction (M) and binary prediction (B), for time-domain and frequency-domain signal inputs, over input diversity for LTE and 5G systems. The methodology is proven essential and effective to achieve ~95% prediction accuracy.
1) Channel Profile Prediction: Figure 1 shows the average prediction accuracy of channel profiles in a simulated LTE system, at various input diversity, for time-domain and frequency-domain signals. Without input diversity (i.e., one input per prediction), the prediction accuracy is unacceptably below 90%. The more input diversity, the higher the accuracy. Specifically, the prediction accuracy is at 99% using five inputs per sequence. Binary prediction is not needed, as the prediction accuracy of traditional multi-class prediction is sufficiently high. The prediction performance of frequency-domain signals is comparable to that of time-domain signals.
Similarly, Figure 2 shows the average prediction accuracy of channel profiles in a simulated 5G system. The prediction accuracy of the traditional multi-class predictions, for time-domain and frequency-domain signals, is unacceptably below 80%, and input diversity cannot improve it. However, binary prediction, which is used to reduce prediction uncertainty and complexity, enhances prediction accuracy to 90% and 94%, respectively, for input diversity of five. Input diversity is also effective for binary prediction. The prediction performance of frequency-domain signals is slightly better than that of time-domain signals.
2) Doppler Shift Prediction: Figure 3 shows the average prediction accuracy of Doppler shifts in simulated LTE and 5G systems, at various input diversity, for time-domain and frequency-domain signals. Without input diversity, the LTE prediction accuracy is unacceptably below 70%. The more input diversity, the higher the accuracy. Specifically, the prediction accuracy is at 97% using five inputs per sequence. Binary prediction is not needed as the prediction accuracy of traditional multi-class prediction is sufficiently high. The prediction performance of time-domain signals is better than that of frequency-domain.
On the other hand, the 5G prediction accuracy of the traditional multi-class predictions, for time-domain and frequency-domain signals, is unacceptably below 30%, and input diversity cannot improve it. However, binary prediction consistently enhances prediction accuracy to 88% and 94%, respectively, using five inputs per sequence. Input diversity improves binary prediction. The prediction performance of frequency-domain signals is slightly better than that of time-domain signals.
3) SNR Prediction: The prediction accuracy of SNR in a simulated LTE system using the traditional multi-class predictions for time-domain signals is at 100%. Figure 4 shows the average prediction accuracy of SNR, at various input diversity, for frequency-domain signals. The prediction accuracy of the traditional multi-class predictions, for frequency-domain signals, is unacceptably below 60%, and input diversity cannot improve it. However, binary prediction enhances prediction accuracy to 97% using five inputs per sequence. The prediction performance of time-domain signals is better than that of frequency-domain signals.
Similarly, Figure 4 shows the average prediction accuracy of SNR in a simulated 5G system, at various input diversity, for frequency-domain signals. The prediction accuracy of the traditional multi-class predictions, for frequency-domain signals, is at 99%. Input diversity is also effective for binary prediction; however, binary prediction is not needed, as the prediction accuracy of traditional multi-class prediction is sufficiently high. The prediction performance of time-domain signals is better than that of frequency-domain signals.

IV. LESSONS LEARNED
1) Neither time-domain nor frequency-domain signal data is the better prediction input for all three applications. Both need to be evaluated.
2) Input diversity enhances prediction accuracy in all cases, for time-domain or frequency-domain signal inputs and multi-class or binary prediction.
3) Binary prediction enhances prediction accuracy for time-domain or frequency-domain signal inputs.
4) Combining input diversity and binary prediction increases prediction accuracy more than either achieves separately.
5) The hybrid CNN-LSTM model is essential and effective for input-diversity multi-class or binary prediction.

V. FUTURE RESEARCH
The following are potential future research ideas, utilizing and extending the methodology for DL prediction applications. Upon completion of these research ideas, the proposed methodology shall be validated and considered as a universal DL prediction methodology in wireless communications, paving the way for an era of DL for Signal Processing, similar to DL for Computer Vision and Natural Language Processing, and for applications in other domains.
1) The current methodology uses either frequency-domain or time-domain signal data as prediction inputs. We recommend investigating combined frequency-domain and time-domain signal data in every input. CNN is known to disregard noise (i.e., uncorrelated data) and extract correlated data for optimal predictions.
2) The grid input size is adopted from a MATLAB example and is effective without modifications. There are opportunities to optimize prediction performance and data efficiency by experimenting with the grid size.
3) The CNN model is adopted from a MATLAB example and is effective without modifications. There are opportunities to optimize prediction performance and model complexity by exploring model parameters such as the number of layers or the filter size.
4) Simulated signal data is currently generated based on single-antenna LTE and 5G systems. Since Multiple-Input and Multiple-Output (MIMO) LTE and 5G systems are widely used, it is recommended to investigate the methodology with simulated signal data from MIMO LTE and 5G systems. The current input diversity is designed by grouping like-label inputs and therefore does not possess the correlation inherently embedded in MIMO signal data. Since the LSTM model is known to learn cross-correlation in time-series data, increased learning from MIMO signals is expected. Moreover, prediction latency will be reduced to one LTE subframe or 5G slot duration on MIMO systems whose number of antennas is equal to or greater than the number of inputs in each sequence.
5) Sequentially collected real-world signal data on each antenna (i.e., time series) are inherently correlated (e.g., similar noises, Doppler shifts, channel profiles); thus, the LSTM model is expected to learn better with real time-series data.
6) Simulated signal data is excellent for exploring and investigating new concepts; however, experimenting with real data is required to put the concept to practical use. On the other hand, collecting and forming real signal datasets is costly due to estimating and labeling ground-truth values (e.g., noises, Doppler shifts, channel profiles).
7) The methodology is demonstrated to be effective in predicting channel profiles, Doppler shifts, and SNR. It is recommended to use it to investigate predictions of other parameters in wireless communication, such as signal, modulation, and SINR. Moreover, the methodology can be extended to applications in other domains.
8) Numerical channel profiles with multipath powers, angles, etc., are difficult to estimate or predict. As a result, CSI is the current parameter of choice for channel estimation. Our methodology is successfully used to predict categorical channel profiles. However, it is recommended to design more fine-grained categorical divisions of the channel profiles (e.g., ten or twenty) for more comprehensive and informative channel profile predictions.
VI. DEEP LEARNING FOR SIGNAL PROCESSING
Similar to DL for Computer Vision and Natural Language Processing, DL for Signal Processing is initially based on the effectiveness of the proposed methodology and extends its applications, capabilities, and performance. Signal types from various application domains can be considered, such as electrical, audio, and pressure, and from various environments such as underwater, space, and the human body (e.g., from pacemakers). The capabilities can be enhanced in use-case-specific frameworks like those in CV and NLP. Finally, improved prediction performance is pursued and anticipated, compared to traditional statistical estimation techniques.

VII. CONCLUSION
This paper reviews and discusses three recent DL applications that predict channel profile, Doppler shift, and signal-to-noise ratio (SNR) for LTE and 5G wireless communication systems. For prediction robustness, signal dataset generation via MATLAB simulation emulates real-world conditions, including wide ranges of randomized channel profiles, Doppler shifts, and SNRs. The methodology, comprising a hybrid CNN-LSTM model, input diversity, and binary prediction, is proven effective, demonstrating prediction accuracy at ~95%. The lessons learned are that input diversity and binary prediction can enhance prediction accuracy when standard single-input, single-output, multi-class prediction underperforms, and that the hybrid CNN-LSTM model is essential and effective in multi-class or binary prediction with input diversity. Additionally, potential future research ideas are identified. The goal of the review is to bring forward a universal DL prediction methodology for signal processing in wireless communications and beyond. This work received computational support from UTSA's HPC cluster SHAMU, operated by University Technology Solutions.