Generalization Ability of Deep Learning Algorithms Trained Using SEM Data for Objects Classification

This paper proposes an efficient method to determine the material of spherical objects and the location of the receiving antenna relative to the object in bi-static measurements using supervised learning techniques. From a single observation, we compare the classification performances obtained by applying several classifiers to different data types: the Ultra-Wide Band scattered field in the time and frequency domains, and pre-processed data from the singularity expansion method (SEM), which has seldom been used in classification because it is considered noise sensitive. We selected a robust SEM technique, vector fitting, to decompose the frequency response into complex natural resonances (CNRs) and residues. Indeed, CNRs are aspect independent and can therefore be used to discriminate the objects. However, the residues associated with each pole depend upon the aspect angle and hence have never been exploited. In this paper, we propose a novel use of those residues. Additionally, we construct an original data set using SEM data in order to further improve the robustness to noise and the generalization capacity of the learning algorithms. The advantages of using SEM data for object classification are highlighted by comparison with raw scattered field data in the time and frequency domains, with the classification algorithms optimized in each case. The results are very promising, especially in terms of generalization, robustness to noise, and computation time, which are all reasons to take an interest in SEM for these purposes.

object and are independent of the incident and aspect angles, making the SEM a very interesting method in an operational context when the target position is not completely controlled. In fact, the SEM is widely used for identification, where it has been applied to study the natural frequencies extracted from Perfect Electric Conductor (PEC) objects (Chauveau et al., 2007b; W. Lee et al., 2012) or dielectric objects (Chauveau et al., 2007a; Chen, 1998). It has also been investigated to measure the maturity of fruits (Leekul et al., 2014; Tantisopharak et al., 2016).
Indeed, combining the SEM with a classification algorithm is an interesting research topic that has seldom been exploited and has only been applied to a limited set of scattered-field directions. The work found in the literature deals with simple classification cases: bi-class (J. H. Lee et al., 2003) or multi-class (Garzon-Guerrero et al., 2013), the latter for the purpose of classifying the size of four homothetic objects. These studies have highlighted not only the evident qualities of the SEM (excellent classification from few data, regardless of the aspect angle of the target) but also its sensitivity to noise, which affects the higher-order poles. They have therefore limited their approach to the first one or two complex natural resonances (CNRs) to build the data set. Moreover, they have trained their classifiers at different SNR levels or have applied Principal Component Analysis (PCA) before the CNR extraction to deal with the intrinsic weakness of the SEM to noise. These solutions either complicate the constitution of the data set or increase the computation time before classification. Indeed, PCA can be used for noise reduction and data compression of raw data; however, its computational cost is not very compatible with an operational context.
Our first objective is to address the multi-class classification of spherical objects to discriminate spheres of different materials. In this work we focus on supervised learning algorithms for the classification process, where each data sample is labeled. Classifying an object's material regardless of its size is an interesting task that has not been treated using SEM data. We chose to study the multi-class SVM, DT, and ANN classifiers, as SVM and DT are two of the most basic and robust ML algorithms, while ANNs are more advanced and known for their reliability. To validate the interest of using SEM data, we compare classification performances when using raw data (frequency and time domain responses) and SEM data. For this purpose, we construct the data sets using only noiseless data. Particular consideration is given to the constitution of the SEM data set to obtain compact, informative, and generalizable data that allow an object to be classified from its noisy response without adding a PCA stage. In addition to the CNRs extracted with the SEM, we also exploit other parameters: the residues associated with each pole. These residues are dependent upon the aspect angle and have therefore not been exploited until now. In fact, when added to the data set along with the CNRs, the residues contribute to improving the object and material classification. Moreover, as the residues vary with the observation angle, they can be considered for location detection of the observer when doing measurements in a bi-static manner. To this end, our second objective is to determine the observation angle of the receiving antenna by dividing the space surrounding the sphere into multiple angular sectors.
The paper is organized as follows: In Section 2, we present a brief description of each classifier used in this work. Section 3 deals with the generation of UWB scattered fields (SF) for multiple sphere types and the pre-processing step to extract the singularities using the vector fitting (VF) algorithm. Then, in Section 4, we start by classifying the sphere's material, where we construct the data sets from SF and SEM data. We also show how the algorithms generalize to other sphere dimensions, permittivity variations, and noisy data not included in the training data sets. Next, in Section 5, we present the classification of the observation angle of the simulated spheres using raw SF data and the residues. To our knowledge, this is the first time that residues have been used to determine the position of the observer relative to an object. Finally, we discuss and conclude the work in Sections 6 and 7, respectively.

Supervised ML Algorithms
In this section, we briefly explain each one of the classifiers used within this work. The SVM and DT algorithms are implemented with built-in functions in Python using scikit-learn library (Pedregosa et al., 2012). The ANNs are implemented using the Keras-Tensorflow library tool in Python.

Multi-Class SVM
The SVM is originally a binary classifier that consists in finding the optimal hyperplane separating different data classes. Under the assumption that the data are linearly separable, this plane is found by solving the optimization problem described in (Schölkopf & Smola, 2001). If they are not linearly separable, SVMs employ the kernel method to map the data onto a higher dimensional space where they become linearly separable (Schölkopf & Smola, 2001). In order to apply the SVM in a multi-class configuration, we use the one versus one (ovo) approach (Hsu & Lin, 2002). This concept is based on constructing k(k − 1)/2 classifiers (given k classes) and then treating each problem as a binary one. The hyperparameters of the SVM are C and γ. C is a regularization parameter, which adjusts the width of the margin to minimize the training error. γ is the kernel coefficient, related to the data variance. The "gridsearch" function is used to find the optimal kernel function and the optimal values of C and γ.
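As an illustration, the grid search described above can be sketched with scikit-learn; the feature matrix, labels, and candidate grid values below are placeholders, not the paper's actual data or optimum settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))        # placeholder feature vectors
y = rng.integers(0, 5, size=100)      # placeholder labels for 5 sphere classes

# grid of candidate C and gamma values; rbf kernel with one-vs-one strategy
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1, 1]}
search = GridSearchCV(
    SVC(kernel="rbf", decision_function_shape="ovo"),
    param_grid,
    cv=5,                             # 5-fold cross-validation
)
search.fit(X, y)
best_C = search.best_params_["C"]
best_gamma = search.best_params_["gamma"]
```

The best parameters are then reused to train the final SVM, as done for the values reported in Table 3.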

Decision Tree
DT methods build a decision model based on the actual values of attributes in the data. The data set is recursively divided into smaller homogeneous subsets, resulting in a flowchart-like tree structure. Various criteria can be used to split the data, such as the Gini index, information gain, or entropy (Loh, 2011). The minimum number of samples required to split a decision node is two.
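A minimal scikit-learn sketch of this configuration (Gini criterion, minimum of two samples per split) on a stand-in data set rather than the SF or SEM data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)     # stand-in data set for illustration
tree = DecisionTreeClassifier(criterion="gini", min_samples_split=2,
                              random_state=0)
tree.fit(X, y)
train_accuracy = tree.score(X, y)     # a fully grown tree fits its training data
```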

Multi-Layer Perceptron
One of the well-known ANN models is the Multi-Layer Perceptron (MLP). It comprises an input layer with a number of neurons equal to the number of input data, which simply pass information to the next layer; one or several hidden layers composed of varying numbers of neurons; and an output layer with a number of neurons equal to the number of classes. The neurons of the hidden layers and the output layer are called perceptrons. A perceptron is a neuron that is connected to the outputs of the previous layer and whose output is connected to the neurons of the next layer. It applies a non-linear activation function to the weighted sum of its inputs (Sze et al., 2017).
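The perceptron computation can be illustrated with a minimal numpy forward pass through one hidden layer; the layer sizes here are arbitrary placeholders, not the network trained in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(20, 32)), np.zeros(32)   # input -> hidden weights
W2, b2 = rng.normal(size=(32, 5)), np.zeros(5)     # hidden -> output weights

def forward(x):
    # each hidden perceptron: non-linear activation of a weighted input sum
    h = np.tanh(x @ W1 + b1)
    logits = h @ W2 + b2
    exp = np.exp(logits - logits.max())            # softmax output layer
    return exp / exp.sum()

probs = forward(rng.normal(size=20))               # one probability per class
```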

Convolutional Neural Network
Convolutional neural networks (CNNs) are composed of multiple layers of different types. The most common layers are convolution, pooling, and fully connected layers. The convolution layer is a filter layer, of specific length and width, that moves along the input data. It is composed of kernels, or filters, that are applied to extract specific features from the input vector (Sze et al., 2017). The pooling layer is a sub-sampling layer used to reduce the size of the features computed by the convolution layer. Here, we use the maximum pooling layer, which outputs the maximum input value. Fully connected layers are used to process the output features of the last convolution or pooling layer. They are composed of neurons connected to all neurons of the previous layer.
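The two core operations can be sketched in a few lines of numpy (toy sizes, cross-correlation form as used in CNN layers):

```python
import numpy as np

signal = np.arange(10, dtype=float)    # toy 1-D input vector
kernel = np.array([1.0, 0.0, -1.0])    # one convolution filter of width 3

# "valid" 1-D convolution: slide the filter along the input
conv = np.array([signal[i:i + 3] @ kernel
                 for i in range(len(signal) - 2)])

# max pooling with window 2 and stride 2: keep the maximum of each window
pooled = np.array([conv[i:i + 2].max()
                   for i in range(0, len(conv) - 1, 2)])
```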

Scattered Field Data Generation and Pre-Processing
The UWB SF from five different classes of spheres is computed in the far-field region from 0.01 to 5 GHz, for both mono and bi-static configurations, using Mie series algorithm implemented in Matlab (Mie, 1908).
The SF is recovered for both field components E θ and E ϕ. The five classes of spheres are: a PEC sphere, a sphere with a relative permittivity ϵ r = 4 and conductivity σ = 0.5, and three lossless dielectric spheres with ϵ r equal to 2, 4, and 9. They are enumerated from 0 to 4, respectively. All spheres are illuminated by an x-polarized incident plane wave propagating along the z axis. The SF is recovered for multiple observation angles, where θ varies from 0° to 180° with a 5° step and ϕ from −180° to 180° with a 10° step. θ and ϕ are defined from the standard 3D Cartesian coordinate system. Figure 1 shows the amplitude of the SF frequency response in the back-scattering direction (θ = 180°) for the five simulated sphere classes of 10 cm diameter.
The pre-processing of the SF is done by extracting the resonances and their associated residues from the frequency response of each simulated object. VF is a widely used method for fitting a measured or simulated frequency domain response H(s) using the following rational function (Gustavsen & Semlyen, 1999):

H(s) ≈ Σ_{m=1}^{M} R_m / (s − a_m) + d + s e,

with M being the model order, s = jω, a_m the poles, and R_m the residues; d and e are optional real constants (Gustavsen & Semlyen, 1999). The poles are of the form a_m = −σ_m + jω_m, where σ_m is the damping factor and ω_m is the natural pulsation of the mth singularity. VF is chosen to extract the object CNRs because we have shown in a previous work that this method is more robust to noise and more accurate than other SEM techniques (Zaky et al., 2020).
The first step in the VF algorithm is to identify the CNRs by assuming a set of starting complex poles that are uniformly distributed over the frequency range of interest. Then, through successive iterations, the algorithm relocates them toward the actual poles (Deschrijver et al., 2008; Gustavsen, 2006). If the data are noise free, two iterations are enough to get accurate CNRs, under the condition that the number of starting poles is greater than M. In the presence of noise, further iterations are needed for convergence. Once the CNRs have been determined, the residues can be computed by solving the corresponding least-squares problem.
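The final residue step can be sketched as a linear least-squares problem: once the poles a_m are fixed, the model H(s) ≈ Σ_m R_m/(s − a_m) + d + s e is linear in R_m, d, and e. The poles and response below are synthetic placeholders, not VF output on measured data.

```python
import numpy as np

# synthetic known poles a_m and true residues R_m (conjugate pairs)
poles = np.array([-0.5 + 4j, -0.5 - 4j, -1.0 + 9j, -1.0 - 9j])
res_true = np.array([1 + 2j, 1 - 2j, 0.5 + 1j, 0.5 - 1j])

s = 1j * np.linspace(0.1, 15, 400)                 # s = j*omega samples
H = sum(r / (s - p) for r, p in zip(res_true, poles))

# linear system: one column 1/(s - a_m) per pole, plus columns for d and s*e
A = np.column_stack([1.0 / (s - p) for p in poles]
                    + [np.ones_like(s), s])
coeffs, *_ = np.linalg.lstsq(A, H, rcond=None)
residues, d, e = coeffs[:4], coeffs[4], coeffs[5]
```

With noiseless synthetic data the fitted residues match the true ones to machine precision, and d and e come out near zero.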
We also compute the quality factor (Q-factor) associated with each pole (Chauveau et al., 2006). In this work, we propose to constitute the data set using this Q-factor instead of the commonly used damping factor. This parameter is an important representation of the object, as it is independent of the object's size, which helps in generalizing the classification to spheres of different sizes. It is computed as follows: Q_m = ω_m / (2σ_m). Thus, the Q-factor describes an object's resonance behavior. Strongly resonating objects have resonances with high Q-factors, whereas weakly resonating objects have low Q-factors. From Figures 2 and 3 we see that all classes of 10 cm diameter spheres exhibit 5 CNRs in the frequency range of interest. When the relative permittivity of the dielectric material increases, the sphere becomes a very strongly resonating object and hence has very high Q-factors. In Figure 3, for visibility, we do not show the rest of the Q-factors for classes 3 and 4 as they are higher than 15.
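A one-line sketch of this computation from a pole a_m = −σ_m + jω_m, assuming the definition Q_m = ω_m/(2σ_m):

```python
def q_factor(pole: complex) -> float:
    """Q-factor of a pole a_m = -sigma_m + j*omega_m, as Q = omega / (2*sigma)."""
    sigma = -pole.real          # damping factor sigma_m
    omega = abs(pole.imag)      # natural pulsation omega_m
    return omega / (2.0 * sigma)

q = q_factor(-0.5 + 10j)        # a weakly damped (strongly resonating) pole
```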

Classification of Sphere Material
In this part, we describe the construction of three different data sets that will be used to train the classification algorithms. The first two data sets are directly composed of the SF responses in frequency and time domain, respectively. The third data set is the one integrating pre-processed data by the SEM. The constitution of these data sets is of primary importance as it directly impacts the performance of the classification algorithms. Our goal is to create a data set that strengthens the robustness to noise of VF and improve the generalization ability of classifiers. The comparative results using different classifiers fed with these data sets will be presented and discussed in this section.

Data Set Construction
We start by creating data sets including the five sphere classes. To have a balanced comparison between SEM and SF data, 13 sphere sizes are simulated for each of the five classes. In fact, having multiple sphere sizes represents an advantage for SF data, as the object's response changes with its size. For SEM data, the resonant frequencies and damping factors also change; however, the use of the Q-factor overcomes this change. The dimensions are selected to ensure that each sphere exhibits one to five natural frequencies within the frequency range ([0.01 - 5] GHz). Table 1 shows the diameters of the spheres, where N represents the number of resonances in the frequency range. For spheres having N ≥ 5 resonances in the frequency band, we chose diameters varying from 10 to 18 cm with a 1 cm step, resulting in nine objects for each class. For spheres having fewer than 5 resonances (N < 5), we chose diameters which provide N resonances, with N varying from 1 to 4.
To reduce the size of the data sets, we include data from only three main planes: ϕ = 0° with θ varying (XoZ plane), ϕ = 90° with θ varying (YoZ plane), and θ = 90° with ϕ varying (XoY plane). (It should be noted that creating data sets with all observation angles yields similar classification results.) Thus, for each sphere size we have 111 observation angles (3 planes and 37 angles in the θ or ϕ directions), making a total of 7,215 samples for each of the three data sets, as we have 13 sphere sizes for each of the five classes.

Scattered Field Data Set
We construct a first data set using the amplitude of the simulated SF in the frequency domain. The classifiers used within this work do not support complex numbers, thus we include only the amplitude response. A preliminary study showed that adding the phase to this data set was not relevant, as it doubled its size without improving performance. The input vector is, hence, a 1-D signal of 500 frequency points composed of two channels: the first channel represents the θ component and the second one the ϕ component of the SF response.
The second data set represents the SF in the time domain. The transient impulse response is computed using the inverse Fourier transform of the complex SF. Here, we include the first 10 ns of each signal, which constitutes 100 time points, as the response has essentially decayed beyond this point. The input vector is also a 1-D signal of 100 points composed of two channels, one for each polarization.
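The time-domain construction can be sketched as follows; the frequency response here is a synthetic stand-in for the Mie-computed SF, and the 100-point truncation mirrors the 10 ns window described above.

```python
import numpy as np

# synthetic complex frequency response over 500 frequency points
freqs = np.linspace(0, 3, 500)
freq_response = np.exp(-freqs) * np.exp(1j * 20 * freqs / 3)

time_response = np.fft.irfft(freq_response)   # real-valued impulse response
early_time = time_response[:100]              # keep the first 100 time points
```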

SEM Data Set
To construct this data set, we pre-process the complex frequency response using VF as described in Section 3. The first five natural frequencies of each object and their respective Q-factors and residues form this data set's input vector. For the smaller spheres having fewer than 5 resonances in the frequency range (see Table 1), we pad the missing entries with zeros to preserve the same input vector length. This ensures that information from other columns is not lost, and predictions can be made despite the missing values. Hence, we will have sparse SEM data in this data set. This original configuration is chosen to improve the generalization performance of the classifiers when limiting factors (limited bandwidth, noise, etc.) impact the higher order poles.
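The zero-padded input vector can be sketched as follows; the feature values are placeholders, and the (5, 4) layout (natural frequency, Q-factor, and the two residue amplitudes per pole) is one plausible arrangement of the four parameters, not necessarily the paper's exact ordering.

```python
import numpy as np

N_MAX = 5   # always reserve room for the first five poles

def build_sem_sample(freqs, q_factors, res_theta, res_phi):
    """Stack the four SEM parameters into a fixed (N_MAX, 4) array,
    zero-padding the rows of poles missing from the frequency band."""
    sample = np.zeros((N_MAX, 4))
    n = len(freqs)
    sample[:n, 0] = freqs
    sample[:n, 1] = q_factors
    sample[:n, 2] = np.abs(res_theta)
    sample[:n, 3] = np.abs(res_phi)
    return sample

# a small sphere exhibiting only two resonances in the band
x = build_sem_sample([1.2e9, 2.5e9], [8.0, 11.0],
                     [0.3 + 0.1j, 0.2j], [0.1, 0.05])
```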
The input vector is, thus, of length 5 and composed of four parameters: the natural frequencies, the Q-factors, and the residue amplitudes of the θ and ϕ components, respectively. The residues are included even though they are aspect dependent, as they contain additional information about the objects, which can be significant when the natural frequencies and Q-factors are almost similar for some objects. We also test more conventional approaches for building the SEM data set to find the optimal one. Thus, we will have the following three cases:
• case 1: the original data set as described above (four parameters: natural frequencies, Q-factors, and their respective residues);
• case 2: the Q-factor is replaced by the damping factor (four parameters: natural frequencies, damping factors, and their respective residues);
• case 3: the residues are eliminated from case 1; hence, the input vector has only the first two parameters (natural frequencies and Q-factors only).

Table 1 Thirteen Simulated Sphere Diameters (cm) Having Different # of Resonances (N) for Each Sphere Class (for N ≥ 5, the diameters vary from 10 to 18 cm with a 1 cm step)

Class                  N = 1   N = 2   N = 3   N = 4
(1) ϵ r = 4, σ = 0.5   3       4.5     6       8
(2) ϵ r = 2            4.5     6       8       9
(3) ϵ r = 4            3.5     5       6       7
(4) ϵ r = 9            2.5     3.5     4.3     5
In the rest of the paper, we will use the following abbreviations to refer to the data sets: FD data, TD data, SEM data which are frequency domain, time domain and SEM data respectively. Note that cases 1, 2, and 3 only refer to SEM data.
The following flowchart, shown in Figure 4, represents the three approaches that will be studied for object classification using the previously presented algorithms.

Classification Results
Each data set is split into 80% for training and 20% for testing. The test samples are composed of random observation angles not present in the training set, with each class having an equal number of samples. In a further step, we test noisy data affected by additive white Gaussian noise (AWGN) at different SNR levels. We then present results on the generalization ability of the classifiers. We begin with spheres with diameters larger than those included in the training data sets (>18 cm). Increasing the sphere's diameter means, physically, that the first five natural frequencies are shifted down in the frequency band. For that, we simulate new spheres of 19 and 30 cm diameter for each class. We also simulate smaller spheres (Table 2) with fewer than 5 resonances and different from those simulated in Table 1, to test the generalization capabilities when the natural frequencies are shifted up in the frequency band. The results presented are averaged over 10 runs.
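The noisy-test procedure can be sketched as follows: AWGN is scaled to a prescribed SNR in dB before being added to the signal (a synthetic stand-in here, not the actual SF response).

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so the result has the prescribed SNR (dB)."""
    power = np.mean(np.abs(signal) ** 2)
    noise_power = power / (10 ** (snr_db / 10))
    noise = rng.normal(scale=np.sqrt(noise_power), size=signal.shape)
    return signal + noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * np.linspace(0, 1, 500))   # stand-in SF response
noisy = add_awgn(clean, snr_db=20, rng=rng)
```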

Training Phase
During this phase, the parameters of each classifier are tuned to achieve the highest accuracy on the training data. First, the SVM parameters are fixed according to the results of the "gridsearch" function. For that we defined, for each data set, a grid of C and γ values with the rbf kernel function and ran 5-fold cross-validation. The resulting optimum values are given in Table 3. Notice that the training parameters are much smaller for the SEM data set, where a simple linear kernel is used, because the data are more easily separable than with the other data sets.
Second, for the DT algorithm, the Gini criterion is used to measure the quality of the split in the tree and decision nodes are randomly chosen to be further split.
Then, when using the MLP classifier, we apply one hidden layer while varying the number of neurons. Note that we have done tests with other topologies by increasing the number of hidden layers and have found that the performances are almost similar. We use the tanh activation function with SEM data and the Rectified Linear Unit (ReLU) for SF data. The learning rate is updated using the Adam optimizer (Kingma & Ba, 2014). Finally, the CNN is tested. For SF data we adopt the LeNet-5 architecture (Lecun et al., 1998), where we change the filter size to apply it to 1-D input signals. For SEM data, we apply one convolutional layer with six filters, followed by one hidden layer with 32 neurons, with the ReLU function applied in all layers. Training is run over 100 epochs for SEM data and 400 epochs for SF data with a batch size of 32 for both MLP and CNN classifiers. They both have an output layer composed of five neurons using the softmax activation function to compute the output probabilities.

Table 2 Simulated Smaller Sphere Diameters (cm) Having Different # of Resonances (N) for Each Sphere Class

Class                  N = 1   N = 2   N = 3   N = 4
(1) ϵ r = 4, σ = 0.5   2.5     4       5       7
(2) ϵ r = 2            4       5       7       8.5
(3) ϵ r = 4            3       4       5.5     6.5
(4) ϵ r = 9            2.2     3       3.8     4.5
Finally, Table 4 shows the execution times recorded during the training phase of the four classifiers for each data set. The values shown for SVM take into account only the training using the optimum parameters chosen earlier. The MLP and CNN classifiers consume much more time than SVM and DT, as their calculations are more complex. In addition, training the classifiers on SEM data is much faster than on raw data. The IFFT used to obtain the time responses takes 0.05 s when applied to a single observation angle. Additionally, the SEM pre-treatment using VF takes 0.08 s for a single observation angle.

Test Phase
First, we present the mean accuracy results (% of data points classified correctly) when testing the 20% remaining samples of each data set for varying numbers of neurons using the MLP classifier.
In Figure 5a, we can see that, for SEM data (case 1), starting from 32 neurons we obtain excellent accuracy (>98%), while for the SF data sets 256 neurons yields the highest accuracy of 97%. The number of parameters (weights) computed in the hidden layer increases with the number of neurons, as shown in Figure 5b, where the FD data set has the most parameters, as it has the longest input vector. Thus, the SEM data set yields both the highest accuracy and the fewest computed parameters with the MLP.
In addition, we test the three cases of SEM data set construction. Figure 6 shows that the MLP classifier trained using SEM data of both cases 1 and 2 has a higher accuracy (100%) for a number of neurons >32 and is more stable (lower variance) than the classifier trained without the residues. In fact, from Figure 3, we see that some classes have very close natural frequencies and Q-factors; that is why, when trained with case 3 (without residues), it becomes more difficult to separate those classes. This also shows that residues are indeed parameters that contain information about the object. The DT, SVM, and CNN classifiers have similar results, with a high variance when trained without residues. Thus, for the rest of this work we only use the SEM data sets of cases 1 and 2, which include the residues.
The performances of the classifiers are also evaluated with the following metrics: the Sensitivity (Sens) and the Specificity (Spec) of each class. Sens is the probability of classifying a sample as True Positive (TP) and Spec is the probability of classifying it as True Negative (TN); both vary from 0 to 1. They are computed as follows: Sens = TP/(TP + FN) and Spec = TN/(TN + FP), where FN is False Negative and FP is False Positive. Table 5 shows the accuracy performances of the classifiers when applied to the different test data sets. We can see that the SEM data (case 1 or 2) yield the highest recognition rate for all classifiers, with Sens and Spec equal to 1. Additionally, the CNN model performs better than SVM and DT when trained using FD or TD data. In this case, the Sens and Spec are higher than 0.97 for all classes. Thus, we see that the classifiers trained using SEM data holding the residues are more capable of distinguishing spheres having close characteristic poles and provide higher accuracy while consuming much less computational cost than classifiers trained using raw data.
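Per-class Sens and Spec can be computed from a confusion matrix as sketched below (toy labels; TP, TN, FP, and FN are read off the row and column sums):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2, 2, 0]      # toy ground-truth labels
y_pred = [0, 1, 1, 1, 2, 2, 0, 0]      # toy predictions
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted

sens, spec = [], []
for k in range(cm.shape[0]):
    tp = cm[k, k]
    fn = cm[k].sum() - tp              # class-k samples predicted elsewhere
    fp = cm[:, k].sum() - tp           # other samples predicted as class k
    tn = cm.sum() - tp - fn - fp
    sens.append(tp / (tp + fn))        # Sens = TP / (TP + FN)
    spec.append(tn / (tn + fp))        # Spec = TN / (TN + FP)
```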

Noisy Data
Now, we test the classifiers' ability to handle noisy data. Given that generating a data set with different SNR levels is challenging, we opted to assess the robustness of our classifiers to noise by evaluating them on noisy data that were not seen in the training phase. We chose the 15 cm diameter sphere, as its noiseless response is already included in the training data set. Several AWGN levels are added to the SF response of the five spheres of 15 cm diameter. The SNR values are chosen such that 65 dB is one of the highest values that can be obtained in an anechoic chamber, whilst 10 dB is a low value. As in the noiseless case, VF is applied to extract the CNRs and residues from the noisy signals. In (Zaky et al., 2020), the authors show that noise affects the damping factor values much more than the natural frequencies. Consequently, the Q-factor is also affected by noise, as it is computed using the damping factor. Note that when using VF with very noisy signals, a minimum of 10 iterations is needed for good convergence of the CNRs, and the model order M should be carefully selected: it should not be too high, to avoid numerous noise-related poles that affect the convergence of the actual CNRs.
Accuracy results of the classifiers are shown in Figure 7. For all SNR values, both ANN classifiers trained using SEM data (case 1 or 2) achieve high accuracy results that are better than with SF data. The Sens and Spec are higher than 0.99 for all classes. For 10 dB SNR, results start to decrease but are still high and similar to those obtained in (Garzon-Guerrero et al., 2013) for the PEC sphere, although, as opposed to our work, a PCA stage was added there to reconstruct the SF response prior to the CNR extraction. However, SVM and DT are unable to determine the classes accurately for 20 and 10 dB SNR, where they reach 40% and 20% accuracy respectively, as they misclassify all the samples of some classes. These results can be explained by the fact that the SVM and DT classifiers base their decisions mainly on the Q-factor, as it is constant for all sphere dimensions; in the presence of noise it is highly perturbed, which in turn perturbs the classification. On the contrary, ANNs are able to classify the noisy SEM data, as they are capable of taking advantage of the extra information provided by the original structure of the SEM data set (sparse data, Q-factor, and residues) proposed in this work.
For FD and TD data, we can note that the CNN performs better than the other classifiers and has good performances for high SNR values. However, results start to degrade for SNRs lower than 20 dB as seen in Figure 7. From Table 6, we see that for 10 dB SNR the Sens and Spec deteriorate for some classes. Hence, we can conclude that the association of VF with an original input vector structure and ANN classifiers compensates for the noise sensitivity of SEM methods which then outperforms the results obtained using raw data, even at low SNR levels.

Generalization Using Different Sphere Sizes
We test the generalization ability of all classifiers on larger and smaller spheres not included in the training data set. First, for larger spheres, the accuracy results shown in Figure 8 indicate that all classifiers trained using SEM data of case 1 achieve an accuracy of 100% for all classes and for both 19 and 30 cm diameters. The Sens and Spec are equal to 1. These results are due to the Q-factor associated with each natural frequency, which is constant independently of the sphere's size. On the contrary, the SEM data of case 2 did not achieve accuracy results as high as case 1, which shows that replacing the damping factor by the Q-factor is important to properly generalize the classification to sphere sizes larger than those included in the training data set.

Figure 7. Accuracy (%) of noisy data when testing several classifiers using the three data sets.
Comparing SEM data (case 1) with TD and FD data, we find with the CNN classifier an overall gain for SEM data of nearly 4% and 6%, respectively, when testing the 19 cm sphere diameter. Moreover, for larger spheres (30 cm diameter), results deteriorate for classifiers trained using raw data, as seen in Table 7. This shows that classifiers trained using SEM data can be efficiently generalized to larger spheres having at least 5 resonances in the frequency range, whereas classifiers trained using SF data sets are unable to classify larger spheres whose sizes are not included in the training data set.
Second, we test smaller sphere dimensions not included in the training data set. Table 8 shows the accuracy results, using the CNN, for data with only N < 5 resonances in the frequency range, as presented in Section 3 (the other classifiers have almost 20% less accuracy for SF data and the same accuracy for SEM data). Using the CNN, the sparse SEM data yield 0% error for the different numbers of resonances N, except for N = 1, where some samples from class 0 are misclassified, as seen in Table 9. However, for the SF data sets, when the dimension decreases (i.e., smaller N), the accuracy decreases drastically: the Sens of some classes is lower than 0.5 for N = 1. This shows that the pre-processing of the SF using the SEM method, and the computation of the Q-factor to replace the damping factor, is an important step in a classification process in order to distinguish spheres that have different diameters from those included in the training data set.

Generalization to Spheres of Different Permittivity
Here, we test the generalization ability of the classifiers to spheres having permittivity values different from those included in the training set. This approach is tested with spheres whose permittivity deviates by ±10% to ±20% from ϵ r for classes 2, 3, and 4, where ϵ r is respectively equal to 2, 4, and 9.
In Figure 9, we show the results obtained with FD, TD, and SEM data when using the CNN. We can see that we can generalize well to all three classes when varying the sphere's permittivity. We also notice that we obtain higher accuracy levels when using SEM data.
A final test was realized with SEM data, where we varied the sphere's permittivity from 2 (class 2) to 4 (class 3) with a 0.2 step and recorded the accuracy performances obtained with the CNN. The curves in Figure 10 show the accuracy obtained. One can note a symmetry in the evolution of these two curves, and an accuracy of 50% is reached for a permittivity of 3.
Similarly to the sphere's size, when testing spheres having different permittivity values, the algorithm does not overfit and is able to generalize well by approximating the result to the nearest class.
Finally, the results presented earlier indicate that combining SEM data with the MLP or CNN classifier is very robust, as it tolerates permittivity variations and classifies spheres with diameters different from those included in the training phase with a rate higher than 98%, even when there is only one resonance in the frequency band. In addition, it still provides good results in the presence of noise, where it outperforms raw SF data sets even at low SNRs. Although the SEM is known for its sensitivity to noise, these performances can be explained by a combination of three major elements. First, the use of VF, an efficient SEM method chosen for its robustness to noise that has seldom been used for object characterization (Zaky et al., 2020). Second, the sparsity of the proposed original input vector integrating not only the natural frequencies but also their respective Q-factors and residues. Third, the use of ANN classification algorithms, which take advantage of this informative data set and make the difference at low SNR. This association allows better classification results to be obtained without adding noisy data in the training step. Moreover, the proposed compact input vector ensures a considerably reduced computational cost during training compared to methods using raw SF data.

Classification of Observation Angle
After classifying the sphere material in the previous section by means of SEM data, the second stage is to determine the observation angle (i.e., the angle defined by the receiving antenna relative to the transmitting antenna in a coordinate system centered on the sphere). For this purpose, we split the space surrounding the sphere into eight angular sectors, each containing various observation angles in the θ and ϕ directions, as seen in Figure 11. We treat only one quarter of the sphere, since the radiation pattern of the field scattered by the sphere has two symmetry planes. The objective is to determine the angular sector containing the scattered wave vector (which is equivalent to determining the direction of the receiving antenna) using the classification algorithms. Each angular sector includes the following observation angles:

Data Set Construction
For each of the five spheres of different materials simulated in Section 3, we create data sets that contain eight classes corresponding to the sphere's angular sectors. Each data set includes the 13 sphere sizes simulated in Section 3 and thus holds 4,810 samples (37 (θ) × 10 (ϕ) × 13 sphere sizes).
The FD and TD data sets are constructed as in Section 4.1.1, with the same length: for the frequency responses we include the amplitude of both the E θ and E ϕ field components, while for the time responses we include the first 10 ns of the signal for both field components. For the SEM data set, the input vector is of length 5 and is composed of two parameters representing the amplitudes of the residues related to the field components E θ and E ϕ, respectively. The natural frequencies and the Q-factors are eliminated, as they do not vary with the aspect angle. Additionally, the data set contains sparse data for the reasons mentioned in Section 4.1.2.
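The assembly of residue-amplitude samples labeled by angular sector can be sketched as below. This is an illustration only: the helper name is hypothetical, and the layout uses a generic zero-padded block per field component rather than the exact 5-element vector described above.

```python
import numpy as np

def angle_dataset(residues_theta, residues_phi, sector_labels, n_poles=5):
    """Assemble observation-angle samples from residue amplitudes (sketch).

    residues_theta / residues_phi : one complex residue array per sample,
        for the E_theta and E_phi components.
    sector_labels : angular-sector class (0..7) for each sample.
    Residues of poles outside the band are left as zeros (sparse data).
    """
    X, y = [], []
    for r_t, r_p, label in zip(residues_theta, residues_phi, sector_labels):
        row = np.zeros(2 * n_poles)
        k_t = min(len(r_t), n_poles)
        k_p = min(len(r_p), n_poles)
        row[:k_t] = np.abs(r_t[:k_t])                    # |residues| of E_theta
        row[n_poles:n_poles + k_p] = np.abs(r_p[:k_p])   # |residues| of E_phi
        X.append(row)
        y.append(label)
    return np.array(X), np.array(y)
```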
In fact, using residues instead of raw SF data presents two merits: first, the input vector contains much less data; second, the amplitude of the residues is independent of the object's size. Indeed, the natural frequencies at which the residues are derived are inversely proportional to the size of the sphere: for each sphere type, the residues are independent of its size since they are computed at the same electrical length. Consequently, they are unique to each sphere but informative about its aspect angle. Thus, determining the observation angle using the residues will be faster than with raw data; however, as there is less data, the residues might be more sensitive to noise.

Classification Results
The results of the observation angle classification, averaged over 10 runs, are presented below. As for the material classification process, we divide each data set into 80% for training and 20% for testing. Then we test the classification using noisy data and larger or smaller sphere sizes not included in the training data sets.
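The evaluation protocol (random 80/20 split, accuracy averaged over 10 runs) can be sketched with scikit-learn as follows; the helper name and the use of the run index as random seed are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def mean_test_accuracy(X, y, clf_factory, n_runs=10):
    """Average test accuracy over several random 80/20 splits,
    mirroring the averaging used for the reported results."""
    scores = []
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=run, stratify=y)
        clf = clf_factory()          # fresh classifier each run
        clf.fit(X_tr, y_tr)
        scores.append(clf.score(X_te, y_te))
    return float(np.mean(scores))
```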

Training Phase
As in Section 4.2.1, the SVM parameters are found using the "gridsearch" function. Table 10 shows the optimum training parameters. Due to the complexity of the problem, the training parameters are higher than before for the SEM data.
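A grid search of this kind can be sketched with scikit-learn's `GridSearchCV` (the paper does not name its implementation, so this is an assumption); the search ranges below are hypothetical, the retained optimum values being those reported in Table 10.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical search ranges; the paper's optimum values are in Table 10.
PARAM_GRID = {"C": [1, 10, 100, 1000], "gamma": [1e-3, 1e-2, 1e-1, 1]}

def tune_svm(X_train, y_train, cv=5):
    """Exhaustive cross-validated grid search over SVM hyper-parameters,
    in the spirit of the 'gridsearch' tuning step."""
    search = GridSearchCV(SVC(kernel="rbf"), PARAM_GRID, cv=cv, n_jobs=-1)
    search.fit(X_train, y_train)
    return search.best_params_, search.best_estimator_
```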
The DT classifier's parameters are again left at their default values, with the Gini criterion used to measure the quality of the splits in the tree.
For the MLP classifier, we apply two hidden layers, each with 64 neurons for SEM data and 256 neurons for SF data. Increasing the number of layers does not change the results. We apply the ReLU activation function in both layers and for all data sets (SF and SEM). For the CNN classifier, the LeNet-5 architecture is again adopted for the SF data sets, while for the SEM data set we use one convolutional layer with six filters followed by one hidden layer with 64 neurons. The number of epochs is 256 for SEM data and 512 for SF data, with a batch size of 64 for both MLP and CNN. The output layer is composed of eight neurons using the softmax activation function.

Table 9. Sens and Spec (%) for N = 1 resonance using the convolutional neural network.

Figure 9. Accuracy (%) of spheres having different permittivity using the convolutional neural network with multiple data sets.
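The SEM-branch CNN described in this section (one convolutional layer with six filters, one 64-neuron hidden layer, an 8-neuron softmax output) can be sketched as follows. The paper does not state its framework, so PyTorch is an assumption, as are the kernel size and the input length.

```python
import torch
import torch.nn as nn

class SemCNN(nn.Module):
    """Sketch of the small CNN used for the SEM data set: one conv
    layer with six filters, one hidden layer of 64 neurons, and an
    8-way softmax output (one neuron per angular sector)."""
    def __init__(self, input_len=10, n_classes=8):
        super().__init__()
        self.conv = nn.Conv1d(1, 6, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(6 * input_len, 64)
        self.out = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, input_len)
        h = torch.relu(self.conv(x.unsqueeze(1)))   # add channel dim
        h = torch.relu(self.fc1(h.flatten(1)))
        return torch.softmax(self.out(h), dim=1)    # sector probabilities
```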

Test Phase
We start by testing the remaining 20% of samples for the five data sets of each sphere. As the performances are almost identical for all spheres, for the sake of brevity we present the results for the PEC sphere and compare the performance of the SEM, FD, and TD data sets. Table 11 shows that SEM data gives the highest performance with the ANN classifiers, with a 1% error rate. For SF data, there is a 3% error using the CNN and SVM, whereas DT has the lowest performance, with Sens and Spec values not exceeding 70% for all classes; hence, DT is not a classifier suited to this problem. This first test shows that, in a noiseless case, identifying the angular sector containing the direction of the observer is possible using the residues associated with each pole.

Noisy Data
We then test with the same noisy data simulated in Section 4.2.3. The CNN classifier has the highest performance for SF data, while for SEM data the CNN and MLP classifiers perform similarly. DT still does not perform well when tested with noisy data, which confirms that it is very sensitive to data variations.
At high SNR values, SVM, MLP, and CNN trained using SEM data reach almost the same accuracy levels as classifiers trained using the TD and FD data. Nevertheless, when the SNR decreases, as expected, there is a loss of almost 12% for the residues compared to SF data (see Figure 12). In fact, as the SNR decreases, the residues become highly perturbed because they are computed through the resonance poles, which are themselves affected by noise. Only the residues associated with the first pole show little distortion, but as we will see in Section 5.2.4, a single residue is not enough to determine the sphere's angular sector accurately. It is also observed that the misclassified samples lie at the borders of the sectors, which remains acceptable. For the rest of the section, we no longer use DT, as it has the lowest performance for the three data sets.
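Generating a noisy test signal at a prescribed SNR can be sketched as below; this is a generic additive white Gaussian noise routine, not the paper's code, and defining the SNR from the mean signal power is an assumption.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Add white Gaussian noise to a real-valued response so that the
    resulting SNR (in dB, relative to mean signal power) is snr_db."""
    rng = np.random.default_rng(rng)
    p_signal = np.mean(np.abs(signal) ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))        # target noise power
    noise = rng.normal(scale=np.sqrt(p_noise), size=signal.shape)
    return signal + noise
```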

Generalization Using Different Sphere Sizes
Finally, we test the larger sphere sizes simulated in Section 4.2. The residues' amplitudes remain constant for different sizes, as they are associated with the natural frequencies of the SF response, whereas the FD and TD responses depend on the object's size. Figure 13 shows that classifiers trained on raw SF data are unable to determine the angular sectors of all spheres: the accuracy does not exceed 30% for the 30 cm diameter sphere, and the Sens and Spec values of most classes do not exceed 40%. On the contrary, classifiers trained using SEM data achieve high performance, the ANN classifiers having 0.9% and 3% error for the 19 and 30 cm diameters, respectively. This confirms that classifiers trained using SEM data are able to identify the observation angle of larger spheres.
For smaller spheres, we test the classification capability on different sizes for which the SEM test sets include N < 5 natural frequencies (and thus fewer than 5 associated residues). The accuracy results are listed in Table 12.
When the sphere's size decreases, classifiers trained on raw SF data become unable to detect the observation angle; for N = 1 (i.e., the smallest sphere), it is impossible to classify some classes, as seen in Figure 14. The CNN classifier performs better than the others, but its accuracy remains low once the sphere has fewer than 4 resonances in the frequency band. For classifiers trained using SEM data, the MLP and CNN classifiers perform best. Spheres with more than one resonance in the frequency band are easily classified, with only 6% error for N = 2. However, when there is a single resonance, it is difficult to accurately determine some of the sphere's sectors from the residues associated with the first pole (Figure 14), because the amplitudes of the residues associated with the first CNR are almost identical for some sectors. Nevertheless, the accuracy obtained using residues is 57% higher than that obtained with SF data, which is very promising.
These results show that using the residues to determine the position of the receiving antenna is possible even with much less spectral information: the SEM data set contains 100 times less data than the SF data set in the frequency domain and 20 times less than in the time domain. Table 13 summarizes the results obtained when classifying the angular sectors of the five sphere types with the CNN classifier trained using noiseless SEM data. These results are presented for the following test data: a larger sphere of 30 cm diameter (ø = 30 cm), smaller spheres having fewer than 5 resonances in the frequency band, and noisy data at 10 dB SNR.

Discussion
Combining the SEM with a classification algorithm is an interesting research topic that has seldom been exploited, most likely because of the noise sensitivity of the SEM. In this paper, we have attempted to show that appropriate choices can maintain good classification even at low SNRs, but above all we have highlighted the exceptional generalization qualities this combination provides. Indeed, the approach in this work is intended to be global, with the objective of classifying the material of spherical objects of all sizes, from the scattered field in any direction, and without a priori knowledge about the nature of the noise. In this context, the structure of the proposed SEM input vector satisfies this objective. First, computing and including the Q-factor and the residues associated with each natural pole improves the robustness and allows generalization to different object sizes. Second, the sparsity of the SEM data set (replacing higher-order poles with zeros) helps overcome the bandwidth limitations for small objects and improves the robustness to noise by decreasing the weight of higher-order poles during the classification process. These performances, seen in Section 4.2, were also achieved through the use of more advanced classification algorithms and of VF, which extracts the object resonances in a noisy environment with high accuracy. Indeed, our study has shown that at SNR levels ≤20 dB, the more basic algorithms (DT and SVM) do not take advantage of the additional information provided by the original format of the proposed SEM input vector and misclassify all samples of some classes. Conversely, ANNs succeed in exploiting the indirect but informative data (residues) and in handling the sparse input data, without the PCA applied to the SF data. In addition, the reduced SEM input vector size enables rapid convergence of the neural networks.
All these advantages open up new opportunities. First, once the sphere's material is classified, it is possible to determine the sphere's size, which is directly related to the fundamental frequency of the first pole. Second, the classification of the observation angle from the residues is a novel problem that raises interesting perspectives, especially when applied to objects more complex than the sphere. This preliminary study of identifying the observation angle yielded significant advantages in terms of computational cost and generalizability despite the sensitivity of residues to noise. In fact, since the computation of the residues is independent of the size of the studied object, it is possible to construct the data set from a single object size. This reduces the processing time but, more importantly, opens new prospects for data set construction, as it is hard to compute the scattered fields for all dimensions of different objects.
However, some limitations of this study can be noted. As in all papers dealing with the SEM, we assume that the first natural frequency of the illuminated object lies in the radar bandwidth, which implies a very low minimum frequency to classify large objects. Furthermore, the time required to compute the CNRs and residues with the VF algorithm may vary: low SNRs require more iterations and finer tuning of the initial parameters for good convergence, which must be included in the total computation time. Aware of these weaknesses, we believe that the gains in processing time and computational cost remain largely in favor of the proposed approach. Moreover, the generalization to different object sizes and the improvements in robustness to noise make this association a promising radar technique for object and observation angle classification.

Conclusion
The present work describes an efficient workflow to classify five different sphere materials and to determine the observation angle of the receiving antenna by dividing the sphere into eight angular sectors. Three data sets are constructed, based on the raw scattered field in the time and frequency domains and on the proposed pre-processed SEM data. They are built from multiple noiseless responses of several sphere sizes, where the SEM data has a specific sparse input vector including the natural frequencies and their associated Q-factors and residues. The comparison between classification on SEM and raw data sets confirms that the proposed method classifies from a single observation angle while being efficient, aspect independent, tolerant to potential permittivity variations, and computationally inexpensive. Moreover, the use of the Q-factor instead of the damping factor makes it possible to accurately distinguish spheres of sizes not included in the training data set. Finally, the sparsity of the SEM input vector, associated with ANN classifiers, also maintains high classification rates at low SNRs without including noisy data in the training phase.
Following the sphere's material classification, we propose the use of residues to classify the observation angle of the identified object. The comparative classification results indicate that, even though the SEM residues are affected by noise, the proposed approach offers excellent classification rates compared to the FD and TD data sets whatever the size of the sphere. This generalization capability facilitates the data set construction and offers interesting perspectives in radar applications. It should also be noted that these performances could be greatly enhanced by including noisy data during the training phase. Hence, this preliminary study brought very promising results for future object classification with radar signals while using simpler but reliable and faster techniques than those based on raw SF data. As a perspective, the study will be extended toward more complex, non-spherical objects and to an operational context by testing on measured scattered fields.

Data Availability Statement
The data that support the findings of this study are openly available in Zenodo at https://doi.org/10.5281/zenodo.7096131.