Analyzing Gas Data Using Deep Learning and 2-D Gramian Angular Fields

The notion of employing deep learning (DL) for gas classification has kindled a revolution that has improved both data collection practices and classification performance. Yet, the current literature, despite its vast contributions, leaves room to advance the state of the art by combining DL with novel visualization methods to boost classification performance and speed. Therefore, this article presents a dual classification system for high-performance gas classification: one operating on 1-D time series data and the other on 2-D Gramian angular field (GAF) data. For the GAF case study, 1-D data are converted into 2-D counterparts by means of normalization, segmentation, averaging, and color coding. The gas sensor array (GSA) dataset is used to evaluate the implemented AlexNet model for classifying 2-D GAF data and an improved version of GasNet for 1-D time series data. Using a cloud-based architecture, the two models are evaluated and benchmarked against the state of the art. Evaluation results of the modified GasNet model on time series data show state-of-the-art accuracy of 96.5%, while AlexNet achieved 81.0% test accuracy on GAF classification with near real-time performance on edge computing platforms.


I. INTRODUCTION
Since its inception, electronic nose (EN) technology has developed advanced means to detect gases and odors based on the response pattern of a gas sensor array (GSA). In its basic form, an EN is primarily composed of a GSA, signal conditioning circuitry, and pattern recognition algorithms. Gas sensors may be divided into several types, including metal oxide (MOX), electrochemical, and conductive polymer sensors, among others. In particular, MOX gas sensors are used extensively in industry due to their compactness, responsiveness, and affordability [1]; thus, such sensors are frequently used in the identification of industrial exhaust gases and combustible gases [2], [3], [4].
To establish an effective gas identification system, the characteristics of the target gases must be studied [5]. Pattern recognition is a useful tool to create models that encapsulate such characteristics to a high degree of precision [16], [29] and can efficiently classify the constituent components of a given gas mixture. Presently, models for identifying combined gases fall broadly into three categories: 1) gas chromatography-mass spectrometry models, which can detect any gas combination with great repeatability, both systematically and objectively, although such methodology is costly and can be time intensive [6]; 2) models that use statistical analysis, machine learning (ML), or computer vision (CV) [7]; and 3) sensor fusion models, which can improve accuracy by combining multiple statistical and/or ML approaches to identify gas mixtures [8]. Despite its popularity, classifying gas data in their native format, i.e., as time series, imposes significant dimensionality and computational demands [9]. Nonetheless, several new time series classification (TSC) techniques have been suggested in recent literature, including bags of slow feature analysis symbols, collective of transformation ensembles, simple exponential smoothing, autoregressive integrated moving average (ARIMA), and dynamic time warping [10], [11], [12].
In addition, ML has been introduced in [13] to build upon TSC research, including support vector machine (SVM), decision trees, and neural networks.
Elevating from a single dimension, deep learning (DL) has evolved to solve not only TSC problems but also 2-D classification conundrums [14], [15]. A popular DL model is the convolutional neural network (CNN), which is used to handle complex classification challenges across a wide range of fields. Because of their shared-weights design and invariance qualities, CNN models can be applied efficiently in image processing, among other data formats [17], [18], [20].
For instance, the notion of employing a deep CNN (DCNN) for gas classification is described in [21], where a well-known model named GasNet is developed. Other models include the LeNet-5 network [22] and the 1D-DCNN [23], where the latter requires less data than other models to train its classifier. It is not surprising to witness the popularity of CNN-based models, as they can be used effectively to classify data in both 1-D and 2-D forms [25], [26]. For 2-D data, Markov transition fields (MTFs), Gramian angular fields (GAFs), and chaos game representation techniques can be used alongside the classification models to classify gas mixtures in a 2-D form [25], [29].
Accordingly, in this article, the authors set out to design enhanced feature extraction approaches using both 1-D and 2-D representations and to benchmark their respective performance against the state of the art. We summarize this article's contributions as follows.
1) Develop a novel 2-D gas data visualization technique, based on the GAF transformation, that represents time series gas data as pictorial representations.
2) Implement a DL-based method for classifying the generated GAF representations with reduced computation time and promising performance.
3) Develop a DL-based incremental enhancement that achieves higher classification accuracy than existing time series-based gas classification techniques.
The remainder of this article is structured as follows. Section II describes the related work, Section III describes the experimental data setup, Section IV expounds upon dataset postprocessing and the classification algorithms, Section V summarizes and discusses the findings of the study, and Section VI concludes the work.

II. RELATED WORK
In this section, recent notable contributions in the field of gas classification are reviewed, with particular interest in ML-based work, which is summarized in Table I.
A. Time Series Data Classification

In particular, the work described in [21] presents a notable advancement in DL-based gas classification, encapsulated in the novel DCNN model "GasNet." The architecture of GasNet consists of a global average-pooling layer, a fully connected layer, and 38 network layers altogether. To obtain relevant representative features, each convolution block has a total of six layers: two convolutional layers, two batch normalization layers, and two rectified linear units (ReLUs). The data are classified using the SoftMax activation function on a fully connected multineuron layer. Furthermore, due to the use of an increasing number of different sensors, regular gas data may have a complicated structure that is difficult to analyze. For unfolded data and for parallel factor analysis with linear discriminant analysis, two techniques, namely, 2-D linear discriminant analysis and partial least-squares discriminant analysis, have been used in [41] to classify and identify the outputs of eight MOX semiconductor sensors for an EN.
On another note, to achieve target identification, concentration prediction, and status assessment all at once, Wang et al. [42] created a multitask learning (MTL) CNN model with a novel dual-block knowledge-sharing structure. The model obtained 95% accuracy in 4 s, and the results suggest that the sample size has a substantial effect on the model's generalization ability. Such a study demonstrates the significant benefits of integrating big data and DL into the EN systems.
In addition to improving the accuracy, Ma et al. [43] introduced a hierarchical classifier (HC) using a tree structure to achieve the categorization of numerous harmful gases in a furniture showroom. An extreme learning machine HC (ELM-HC) is described, which outperforms the standard ELM single classifier in terms of classification accuracy and generalization performance, particularly when the training data are unlabeled. A more comprehensive training dataset relating to humidity interference and sensor drift is required for this approach. The accuracy and performance of ML models may be vastly improved by enhancing the quality of the training data, which can be used for either classification or estimation.
Also, in the research work in [44], three classification models and three regression models based on SVM, ELM, and backpropagation (BP) networks are developed for quantitatively evaluating six categories of volatile organic compounds (VOCs). Using the classifier output as a regressor input creates an integrated model. The ELM-ELM combined model performs best in independent tests, with a classification accuracy of 93% and an R² of 0.94, and provides better efficiency. However, improving the model accuracy increases the training and fitting time.
Moreover, it is noteworthy to mention the low-dimensional DCNN research devised in [23], where the raw data (16 × 100) are expanded into a 1-D vector of size 1 × 1600, which is used for feature extraction.

B. Encoded Image Data Classification
The approach of transferring original data to an analogous-image matrix and then exploiting the benefits of CNN-based image processing for gas identification is suggested in [9]. Because the data dimensions are not shrunk during the analogous-image matrix conversion, the raw time series data retain all their derived features, although the model is quite complicated.
Moreover, a backpropagation neural network (BPNN) and C-means clustering are proposed in [39] to classify gas mixtures, using gray processing to produce the grayscale images used to identify the gases. However, the most appealing models in terms of performance are the multiscale CNN (MCNN), fully convolutional network (FCN), GAF-MTF, relative position CNN (RPCNN), and recurrence plot (RP) models [45], [46], [47], [48]. These techniques may be split into two groups based on the input to the CNN: MCNN and FCN employ 1-D input with preparation of the original time series data, whereas GAF-MTF, RPCNN, and RP transform the raw time series data into 2-D images. A variety of operations on the original time series data are carried out using MCNN's multiscale and multifrequency branches.
Also, delving into DL methods, MLP [49], FCN, and ResNet [35] have been implemented for TSC without tailored feature extraction or data preparation, which implies that the unprocessed 1-D time series data are given directly to the classification algorithm, with a straightforward protocol and minimal complexity in model construction and deployment. There are sizable challenges in converting time series data into images, and [50] explores such obstacles in the scope of pattern recognition. The authors employ GAF to convert the basic time series dataset into polar coordinates, whereas MTF is employed to calculate the transition probabilities along the time axis using a first-order Markov chain; these are handled as dynamic and static time series inputs, respectively. For the classification of time series, a tiled CNN model is presented in [24], and an improved method is then developed in [46], where GAF-MTF modeling and the tiled CNN are combined. The lack of comprehensive dynamic information provided by MTF is possibly to blame for the approach's low classification results when compared with other leading-edge approaches.
Moreover, a similar approach is used by Hatami et al. [45], who suggest an RP-based representation approach for TSC that applies a CNN model to time series transformed into 2-D images. According to their findings, time series exhibit distinctive recurrent tendencies, such as fluctuations and irregular cyclicities, which are typical phenomena of dynamic systems, and the primary goal of employing the RP method is to reveal the locations at which certain trends revert to a prior state. The RP-created images are then classified using a CNN model with two convolutional layers and two fully connected layers. However, because of the immature construction of the CNN model, the outcome of their method is imbalanced across the common datasets.
Continuing this research into image recognition using time series analysis, the researchers in [51] provided a framework for prefixing 2-D images, where time series data are converted into red, green, and blue (RGB) inputs for training a ConvNet. After dimensionality reduction, 2-D images are created using three feature encoding techniques: the Gramian angular summation field (GASF), the Gramian angular difference field (GADF), and the MTF. These 2-D images are then combined into a single large image split across the RGB channels, which is fed into the ConvNet for binary classification.
To maintain the integrity of image recognition within a time series, a novel intrusion pattern recognition approach that relies on GAF and CNN is presented in [52]. Using the GAF technique, 1-D time series intrusion signals may be transformed into 2-D images with richer detailed features while still preserving their time-domain dependency. Therefore, when paired with a suitable CNN architecture, the GAF method can extract specific information about intrusion patterns and recognize them accurately. The relative position matrix with CNN (RPMCNN) is an advanced DL technique for TSC challenges, which uses an RPM to convert original time series data into 2-D images. A TSC technique is also suggested in [27], using 12 conventional datasets with separate GAF, MTF, and GAF-MTF images, the latter resulting from integrating the MTF and GAF representations into a single image; a tiled CNN is then applied to acquire high-level characteristics from the resulting images.
After examining the literature, a more cost-efficient GAF-based method is needed to advance the state of the art in terms of classification accuracy and computational performance. Accordingly, we distinguish our novel contributions as follows.
1) A 2-D GAF-based image classification method built on the AlexNet architecture.
2) An improved CNN-based DL algorithm, GasNet, applied to gas mixture classification and identification, which further reduces label dimensionality.
III. EXPERIMENTAL DATA SETUP

In this section, the data used in this work are described. A gas distribution apparatus at the BioCircuits Institute of the University of California San Diego was used to collect a large-scale dataset known as the GSA dataset [28]. A comprehensive setup for the entire experiment is shown in Fig. 1. The GSA database stores time series records gathered from 16 chemical sensors subjected to varying concentrations of ethylene in air. Each sample is compiled by performing a nonstop collection of the data emanating from the 16-sensor array over the course of approximately 24 h. The dataset used for this experiment is freely available in the University of California Irvine (UCI) ML Repository and contains 417 544 samples with 19 different properties. Analyses are conducted using the sensor responses, which represent how the sensors' conductivity changes when exposed to gaseous mixtures. In addition, the dataset is produced by employing combinations of binary gases with varying concentrations of each component. As a result, a variety of nonlinear changes in the sensors, each produced by a distinct change in the intake, are incorporated into the dataset.
Due to its popularity, extensive testing with the dataset has been carried out for the categorization of mixed gases [21]. This data collection includes two binary gas mixtures in air: ethylene-methane and ethylene-carbon monoxide. The raw input data are collected using 16 MOX sensors. These sensors are subjected to a wide range of gas conditions and comprise four distinct models (TGS-2600, TGS-2602, TGS-2610, and TGS-2620). The sensors' response signals are recorded at a 100-Hz sampling frequency with a 5-V supply voltage applied, and this process is repeated continuously for a period of 24 h. The raw dataset contains 4 178 504 + 4 208 261 = 8 386 765 instances between ethylene-methane and ethylene-CO.
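For readers reproducing the setup, a hedged loader sketch is given below. It assumes the UCI file layout of 19 whitespace-separated columns per line, namely, a time stamp in seconds, two gas concentrations, and 16 MOX sensor readings; the exact column order is an assumption here and should be verified against the dataset's README before use.

```python
import numpy as np

def load_gsa(path):
    """Load one raw GSA recording (assumed 19-column whitespace layout)."""
    data = np.loadtxt(path)       # shape: (n_samples, 19)
    t = data[:, 0]                # time stamps (s)
    conc = data[:, 1:3]           # two gas concentrations (ppm)
    sensors = data[:, 3:19]       # 16 MOX sensor channels
    return t, conc, sensors
```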

IV. METHODS
A. 1-D Data Classification

1) Postprocessing and Labeling: In this article, a labeled version of the GSA dataset is shown in Table II. To label the data for neural network classification, a two-bit binary encoding is employed, whereby each class is represented by two binary digits. For example, a value of "01" indicates CO gas, whereas a value of "00" indicates that neither gas is present and "11" indicates that both gases are present. In addition, a class code from 0 to 3 is assigned to each possible class.
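The labeling scheme above can be sketched as follows; the bit order (first bit for ethylene, second for CO) and the class-code assignment are illustrative assumptions, since Table II is not reproduced here.

```python
import numpy as np

# Assumed mapping: two-bit string -> (class name, class code 0-3).
LABELS = {
    "00": ("none", 0),
    "01": ("CO", 1),
    "10": ("ethylene", 2),
    "11": ("ethylene+CO", 3),
}

def encode(ethylene_present: bool, co_present: bool):
    """Return (two-bit label vector, class code) for one sample."""
    bits = f"{int(ethylene_present)}{int(co_present)}"
    name, code = LABELS[bits]
    return np.array([int(b) for b in bits]), code
```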
2) 1-D Data Classification Using GasNet Architecture: This section describes the model used for time series gas mixture classification and identification on the GSA dataset, as shown in Fig. 2. In this work, the classification problem is divided into two binary classification challenges, each with its own single label: one for "ethylene" and one for "CO." The GasNet [21] architecture generally comprises convolutions, batch normalizations, and pooling operations. The term "convolution block" is chosen by the authors to describe a blend of convolution layers, activation layers, and batch normalization layers. The VGG-Net [34] is used as a template for the GasNet design. The design of the GasNet classification algorithm is shown in Fig. 2; there are a total of 36 layers. Pooling layers separate the convolution blocks, and the final convolution block is followed by a global average-pooling layer. A fully connected layer is used at the end, which computes class scores and makes predictions. GasNet's input tensor shape is m × n × 1, where m is the number of sensors used and n is the dimensionality of each sensor's data; the output of the final convolution block has shape m/4 × n/4 × 128. A global average-pooling layer then averages the activations of each feature map, producing a tensor of shape 1 × 1 × 128. The network returns class values for each input vector, representing four different types of gas concentrations.
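As a sanity check on the tensor shapes quoted above, global average pooling can be sketched in numpy: an activation map of shape m/4 × n/4 × 128 collapses to 1 × 1 × 128 by averaging each feature map over its spatial extent. The shapes follow the text; the random activations and the choice of n = 100 are placeholders.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel over its spatial dimensions: (h, w, c) -> (1, 1, c)."""
    return feature_maps.mean(axis=(0, 1), keepdims=True)

# Example with the shapes quoted for GasNet (m = 16 sensors, n = 100 points per sensor):
m, n = 16, 100
activations = np.random.rand(m // 4, n // 4, 128)   # stand-in for the last convolution block
pooled = global_average_pool(activations)           # shape (1, 1, 128)
```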
In this work, the GSA dataset is split into two subsets: training data (70%) and test data (30%). Hyperparameters are set as follows: the number of epochs is 20 and the batch size is 32. To regularize the model, a 0.5 dropout layer is placed between the global average-pooling layer and the fully connected layer, and a learning rate of 0.0001 is used. An early stopping optimization technique is used to avoid overfitting.

B. 2-D Processing and Classification Method

1) 2-D Data Postprocessing: As described earlier, transforming 1-D data into a 2-D counterpart can produce higher quality classification results using fewer computational resources. Also, 2-D images can reveal characteristics and patterns rarely apparent in the time series data. In this work, the GAF transformation is employed, after reviewing its benefits in the literature. Using a polar coordinate-based matrix, GAF can retain the actual temporal relationship in the time series images [44]. The original time series x = (x_1, x_2, . . ., x_N) is first rescaled so that each value falls between 0 and 1

x̃_i = (x_i − min(x)) / (max(x) − min(x)).

The rescaled data are then encoded into polar coordinates using the angular cosine and the time stamp

φ_i = arccos(x̃_i), r_i = t_i / N

where t_i is the time stamp and N is a constant regularizing the span of the polar coordinate system. A symmetrical image layout is achieved by aligning the image with the raw time series spanning top-left to bottom-right along a single main diagonal. By virtue of this property, the polar coordinates may be transformed back to the original time series. GAF can generate two images using two distinct equations. The GASF is specified as

GASF_(i, j) = cos(φ_i + φ_j)

whereas the GADF is specified as

GADF_(i, j) = sin(φ_i − φ_j).

In this implementation, the GASF is adopted to encode the GAF dataset into a series of 2-D images to take advantage of visually interpreting gas sensor data, in which 1-D time series signals are represented in the form of 2-D images. As described earlier, the GSA dataset is composed of two gas mixtures: CO and ethylene.
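The rescaling, polar encoding, and GASF steps described above can be sketched in a few lines of numpy. This is a generic GASF implementation following the standard formulation, not the authors' exact code; the [0, 1] rescaling matches the equation in the text.

```python
import numpy as np

def gasf(x):
    """Encode a 1-D series as a Gramian angular summation field image."""
    x = np.asarray(x, dtype=float)
    # 1) Min-max rescale into [0, 1] (assumes a non-constant series).
    x_tilde = (x - x.min()) / (x.max() - x.min())
    # 2) Polar encoding: angle from the angular cosine of the rescaled value.
    phi = np.arccos(x_tilde)
    # 3) GASF: pairwise cosine of summed angles, cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])
```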
To populate the GAF image dataset, 16 GAF images are created every 100 s throughout the dataset from each of the 16 sensor raw readings. In total, 1664 GAF images are created. Fig. 3 shows a sample GAF for a 16-sensor array.
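A windowing sketch of the segmentation step above is shown below, assuming non-overlapping 100 s windows at the 100-Hz sampling rate stated in Section III; the 100:1 block-averaging factor is an illustrative assumption, chosen so that each 10 000-point window reduces to a 100-point series before GAF encoding.

```python
import numpy as np

def windows_per_channel(signal, fs=100, window_s=100):
    """Split one sensor channel into non-overlapping 100 s windows."""
    size = fs * window_s
    n = len(signal) // size
    return signal[: n * size].reshape(n, size)

def downsample(window, factor=100):
    """Block-average a window (assumed step; 10 000 points -> 100 points)."""
    return window.reshape(-1, factor).mean(axis=1)
```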
2) AlexNet Image Classification Algorithm: The AlexNet architecture is used as the image classifier to identify gas types from the generated GAF representations, as shown in Fig. 4. The AlexNet model [33] is widely used in image classification research and is well known for its high computational efficiency compared to related models; Fig. 4 depicts the AlexNet architecture for GAF image classification. AlexNet's architecture is made up of five convolutional layers and three fully connected layers.
In this implementation, the model is trained using the GAF dataset and classifies images into four classes (as shown in Table II). The GAF image dataset is divided into three subsets: training (70%), validation (20%), and test (10%). The images are resized to the model's input size of 227 × 227 × 3 during the preprocessing step. There are 1163 training images, 331 validation images, and 170 test images in the separate datasets. Tensors are generated for each subset using a batch size of 32, which is a commonly used value in the literature. A normalization procedure, dividing by 255, is applied to each subset. The normalized training and validation subsets are used on the AlexNet model, and the test accuracy is calculated separately by running the model for prediction on the test dataset. To avoid overfitting and improve the classification accuracy, the model is subjected to a fivefold cross-validation (CV) procedure. However, the overall accuracy is extremely low due to the small size of the dataset: when a fivefold CV is utilized, each fold includes merely 333 images, which introduces overfitting quite quickly. Furthermore, rather than employing CV, most DL models are tuned by optimizing hyperparameters to avoid overfitting. Accordingly, an early stopping strategy is employed instead. To improve the model's performance, different numbers of epochs are used across ten iterations. The average iterated accuracies for each epoch count are shown in Fig. 5, and the best outcomes achieved are annotated.
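The early stopping strategy mentioned above can be sketched framework-agnostically as follows: training halts once the validation loss has not improved for a fixed number of epochs. The patience and min_delta values are illustrative, as the article does not state them.

```python
class EarlyStopping:
    """Stop training when validation loss stops improving (illustrative patience)."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience
```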

V. RESULTS AND DISCUSSION
In this section, the outcomes of employing GasNet for TSC and AlexNet for GAF data classification on the GSA dataset are compared and discussed.
The GSA dataset has been analyzed using two separate architectures. A total of 1664 GAF images have been produced from the GSA dataset. The AlexNet architecture has been trained on 1163 GAF images. The training performance has been evaluated using validation data consisting of 331 GAF images, and predictions from the model have been obtained on 170 GAF images. On the other hand, time series data classification is performed using the GasNet architecture. Because of the kernel size selection and the use of convolution blocks, the GasNet model learns the features more effectively [21].
The outcomes are shown in Table III. Detecting different gas concentrations is achieved successfully using the two different architectures. AlexNet's training accuracy for 50 epochs is 86.9%. To assess the model's efficiency, the test accuracy is used; the model's test accuracy is 81%. When compared to GasNet, the latter outperforms the former in terms of test accuracy: for the GasNet architecture, the training accuracy is 96.9% and the test accuracy is nearly identical at 96.5%. As shown in Table IV, compared with the reviewed literature, the test accuracy achieved using the improved GasNet is on par with or even higher than that of the methods reviewed, particularly [7] (93.0%), [9] (96.0%), [21] (95.0%), and [23] (96.0%).
In contrast, despite the relatively lower test accuracy, the GAF classification using AlexNet is carried out in about 1 s, which is quite promising in terms of computational efficiency for real-time gas classification applications.
Overall, the presented results show that DL, hand in hand with creative data visualization, can result in higher classification performance and can open doors to efficient, real-time gas classification. However, the current implementation can be improved in the following directions.

1) Increase GAF classification accuracy by: a) optimizing AlexNet for GAFs and b) modifying the GAF creation process to include more distinct differences between instances.
2) The GasNet-CNN architecture in this study is suitable for EN devices, making it applicable to gas identification and classification as required. Data augmentation approaches might be used to further increase its performance.
3) Test the implemented models on other publicly available datasets for wider performance benchmarking.
Our future work plans include addressing the limitations above and evaluating the presented models on high-performance edge computing platforms such as the ODROID-XU4 and the Jetson Nano. This evaluation will help assess the viability of using the presented models for real-time gas classification.
To improve the reproducibility of results and enable further performance improvement, the postprocessed time series and GAF datasets, alongside the developed code for 1-D and 2-D classification, are published in a public repository.

VI. CONCLUSION
This article presented a dual classification system for high-performance gas classification: one on 1-D time series data and the other on 2-D GAF data. The GSA dataset is used to evaluate the implemented AlexNet model for classifying 2-D GAF data and an improved version of GasNet for 1-D time series data. Using a cloud-based architecture, the two models are evaluated and benchmarked against the reviewed literature. For the GAF case, the 1-D data are converted into 2-D counterparts by means of normalization, segmentation, averaging, and color coding. Current results signify higher accuracy using GasNet (∼96.5%), a steady improvement upon the literature, while AlexNet achieved 81.0% test accuracy with an outstanding computational time of about 1 s. Future work includes further model optimizations, improving the GAF creation process, and testing the models on various datasets on high-performance edge computing platforms, enabling real-time gas classification applications for research and development as well as industrial facilities.