A New Real-time Lossless Data Compression Algorithm for ECG and PPG Signals
Soumyendu Banerjee, Graduate Student Member, IEEE, and Girish Kumar Singh

Objective: Data compression is useful in tele-monitoring applications because fewer bits are needed to represent the same data. In this work, run-time lossless compression of single-channel electrocardiogram (ECG) and photoplethysmogram (PPG) signals is proposed, preserving all dominant features. Methods: The single-channel data are first quantized using an optimal quantization level, so that fewer bits are needed to represent them while keeping the quantization error low. Second-order delta encoding and run-length encoding (RLE) are then applied. A new approach that uses a ‘buffer array’ along with RLE is also introduced to minimize the number of bits stored. Results: The algorithm was tested on various single-lead ECG and PPG signals available in Physionet. Average compression ratios (CR) of 6.52, 3.82, and 2.49 were achieved for 547 PTBDB ECG records, 48 MITDB ECG records, and 53 MIMIC-II PPG records, respectively. The algorithm was also applied to single-channel ECG collected from 10 healthy volunteers using an AD8232 ECG module at 125 Hz sampling frequency and 10-bit data resolution, which resulted in an average CR of 2.34. Discussion: The algorithm was also run on a smartphone, which provided user-friendly operation. Its low computational complexity and the standalone operation of data collection, compression, and transmission support its use in run-time operation. Significance: A comparative study with previously published works indicated that this algorithm performs better in run-time patient health monitoring applications.


I. INTRODUCTION
A. Overview and motivation
Bioelectric signals are generated by the continuous functioning of various physiological processes in the human body. Among these, the cardiovascular system (CVS) is one of the most important biological systems; it is driven by the continuous contraction and relaxation of the heart. The electrocardiogram (ECG) and photoplethysmogram (PPG) are the most important bioelectric signals generated in the CVS. The ECG (both single and multichannel) represents the full cardiac cycle and is measured by placing electrodes at specific positions on the body [1]. The PPG is a single-channel measure of blood volume in each cardiac cycle, obtained from optical radiation passing through the skin (mainly at the fingertip) [2]. Cardiovascular diseases are among the most common causes of death worldwide, as reported by the World Health Organization (WHO) [3]. Accurate measurement, storage, and transmission of these signals are therefore essential. Long-term data storage requires a large amount of memory, which affects storage efficiency and transmission channel performance. Data compression is the only way to overcome these problems. There are two types of compression, viz. lossy and lossless. In lossy compression the compression ratio (CR) is higher, but the loss of information deemed unimportant introduces reconstruction error; in lossless compression no data are lost, which results in a lower CR. In many medical applications, lossy compression is not acceptable because an important feature may be lost. For this reason, a completely lossless run-time compression process for ECG and PPG signals is proposed in this work; it can also be applied after any lossy compression. Physiological signals contain two types of redundancy, viz. interbeat redundancy, present between successive beats, and intrabeat redundancy, present within each single beat [4]. Eliminating these redundancies results in good compression performance.

B. Related works on lossless ECG compression
Many papers have been published on lossy compression of single-lead [5] and multilead [6] ECG data. Very few papers address lossless compression, and even fewer address lossless PPG compression. Quantization converts the fractional values of physiological data into quantized numbers. Vector-quantization-based lossless compression using the set partitioning in hierarchical trees (SPIHT) model can provide a high CR [7]. Linear prediction (LP) [8], in which future samples of a time series are forecast from past samples, is a very useful tool for lossless ECG compression. This prediction supports lossless encoding techniques such as Huffman coding, variable-length coding [9], and entropy encoding (EC). Huffman coding and EC are popular lossless codes in which specific codes are assigned to repeating data or characters: compression is achieved by assigning shorter codes to frequently occurring characters and longer codes to those that appear rarely. In some works, Golomb-Rice coding (GRC) [10] was used as the primary compression tool, providing optimal prefix codes for ECG signals, mainly in hardware applications [11]. This process assisted run-time data compression and transmission using suitable packet formation [12], [13]. Multilead lossless ECG compression using fast Levinson-Durbin recursion [14] has also been proposed recently. Combining LP, error modeling, and GRC improved the CR considerably [15]. ASCII character coding-based compression, introduced by Mukhopadhyay et al. in [16], achieved a good CR at the cost of a low reconstruction error.
C. Related works on PPG compression
Research on PPG compression is still limited. The published works mainly exploit lossy compression. The Fourier transform, which converts a time-domain signal into the frequency domain, is one such lossy approach: selecting the dominant coefficients gives a good CR but introduces reconstruction error [18]. For tele-monitoring applications, Mukhopadhyay et al. proposed ASCII character coding to compress PPG data [19]. More recently, steganography [20] together with PPG compression using ASCII coding and singular value decomposition (SVD) was proposed in [21]; that work achieved a high CR, but at the cost of a comparatively higher percentage root mean squared difference (PRD). Although the PRD was not very high, all of these works were lossy. In [22], the author proposed run-time PPG compression using Huffman coding with a low PRD, which can be regarded as quasi-lossless. To the best of the authors' knowledge, no published work offers exactly lossless compression of PPG signals.

D. Contributions and paper organization
Therefore, in this work, a new lossless compression process for biological signals that are periodic in nature is proposed. The compression was tested on various ECG and PPG signals in run-time operation. The main contributions of this work are:
1) A new lossless compression procedure using second-order delta encoding (DE) and run-length encoding (RLE).
2) The use of buffer arrays in the RLE strategy to improve the CR.
3) An improved approach to lossless compression for any biological signal with inter-sample redundancy, irrespective of sampling frequency and data resolution, in support of medical science and clinical applications.
In addition, experiments on ECG and PPG signals were used to corroborate that the algorithm can be applied to any biological signal with inter-sample redundancy. Run-time data collection was carried out with a hardware module (Raspberry Pi, AD8232 ECG module, and MCP3008 ADC chip); the compression and decompression algorithms were executed on those signals in Python, and the compressed data were transmitted to a cloud server over an active internet connection. For user-friendliness, the algorithm was also evaluated on a smartphone.
The rest of this article is organized as follows: Section II describes the hardware setup and the proposed compression method, Section III presents the experimental results on various ECG and PPG signals along with a detailed discussion, and Section IV concludes the article.

A. Hardware setup and connection scheme
The run-time data collection, data processing, and transmission were performed using a Raspberry Pi 4 (R-Pi) [23] and a commercially available AD8232 [24] single-channel ECG module. The AD8232 is a heart rate monitoring module that operates from a 3.3 V DC source. It provides an analog single-channel ECG output from three electrodes placed on the body (left arm, right arm, and right leg). The R-Pi is a single-board computer with a 1.5 GHz, 64-bit quad-core Cortex-A72 processor and 4 GB of random access memory (RAM). The R-Pi executed the compression and reconstruction procedures after collecting the data. Consequently, any other hardware device with adequate internal storage and a data processing unit (e.g., Banana Pi, ASUS Tinker Board S) can be used in place of the R-Pi. Since the proposed approach was developed in Python, such a device must provide software support for running Python code. The AD8232 provides an analog output, whereas the R-Pi handles only digital signals; therefore, an MCP3008 analog-to-digital converter (ADC) was used to convert the analog output into a digital signal with 10 bits per sample, which was fed into the R-Pi [25]. The connection scheme of the components is shown in Fig. 1(a) and the hardware setup in Fig. 1(b). The wiring is described in Table I, including the direction of data flow. In this table, 'R{ε}', 'A{ε}', and 'M{ε}' represent the ε-th pin of the R-Pi, AD8232, and MCP3008 module, respectively. The wire numbers are the same in Table I and Fig. 1(a) for ease of understanding. For run-time data transmission, the compressed data were transmitted wirelessly and stored in cloud storage. For reconstruction, the same data were downloaded and decompressed. The proposed algorithm was also run on smartphone devices. A snapshot of the received compressed data and the reconstructed ECG signal on a smartphone screen is shown in Fig. 1(c).
Fig. 2 shows the basic structure of both an ECG and a PPG beat [26]. In this paper, the sample-to-sample coherency of these signals was eliminated using DE and RLE, resulting in a high CR with no data loss. The entire algorithm is shown in Fig. 3. First, the single-channel signal was quantized using a user-provided quantization level 'n'. Second-order delta encoding was then performed on the quantized signal, followed by the proposed RLE method using buffer arrays. The amount of compression depends on the user-provided value β, discussed later. Finally, after header byte generation, the compressed data were either stored or transmitted to cloud storage over an active internet connection.

C. Noise elimination from raw signal
Because the raw signals are contaminated by various noise sources, denoising was necessary before data compression. For both ECG and PPG signals, a second-order Butterworth bandpass filter with a lower cut-off frequency of 0.5 Hz was designed to remove the very low-frequency components that cause baseline artifacts. The upper cut-off frequency was 100 Hz for ECG signals and 3.4 Hz for PPG signals [27], [28]. The cut-off frequencies were chosen so that the amplitude ratio (AR) stayed within 10%, as specified by the American Heart Association (AHA) rule [29]. After denoising the raw signal, lossless data compression was initiated as explained below.

D. Proposed algorithm of compression
In this work, to eliminate the intrabeat redundancy, second-order delta encoding was performed on the single-channel ECG/PPG signal. Initially, the signal was quantized to an n-bit quantization level using (1), where Xi and Yi represent the i-th sample of the original signal X and of the quantized array Y, respectively, and Xmax and Xmin represent the maximum and minimum values of X. The aim of quantization was to represent each sample as an integer, so that the delta encoded array contains integer values. The second-order DE was then computed by sample-to-sample differencing [30] using (2), where Δ1 and Δ2 represent the first- and second-order delta encoded arrays of signal Y (of length N), respectively. Delta encoding is a completely lossless coding technique. To reconstruct the original signal, the samples of the delta encoded array are cumulatively added. This requires the first sample of the original signal: to reconstruct Y from Δ1, the sample Y1 is always necessary. Thus, two samples (viz. Y1 and the first sample of Δ1) were stored separately in order to finally reconstruct Y. After computing the DE, it was observed that, owing to the internal redundancy, runs of successive zeros ('0's) were common in the array Δ2. To remove this coherency, each run was replaced by the number of times the zero was repeated, i.e., run-length encoding (RLE) [31]. An example of RLE is shown in Fig. 4, where the encoded array Ɵ contains the number of occurrences of consecutive '0's instead of keeping all of them as in the main array Δ2. In run-time operation, this array must be converted into binary form, so each element has to be converted into its equivalent binary number.
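The quantization and delta-encoding steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the min-max scaling in `quantize` is an assumed form of the n-bit quantizer, the differences are taken as "next sample minus current sample", and all function names are hypothetical.

```python
def quantize(x, n_bits):
    """Map samples onto integer levels 0 .. 2**n_bits - 1 (assumed min-max scaling)."""
    lo, hi = min(x), max(x)
    if hi == lo:                      # flat signal: every sample maps to level 0
        return [0] * len(x)
    scale = (2 ** n_bits - 1) / (hi - lo)
    return [int(round((v - lo) * scale)) for v in x]


def delta(seq):
    """First-order sample-to-sample differences of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]


def second_order_delta(y):
    """Return (Y1, first sample of delta-1, delta-2) for a sequence with >= 2 samples."""
    d1 = delta(y)
    d2 = delta(d1)
    return y[0], d1[0], d2


def reconstruct(y1, d1_first, d2):
    """Undo second_order_delta by two passes of cumulative addition."""
    d1 = [d1_first]
    for step in d2:                   # rebuild the first-order differences
        d1.append(d1[-1] + step)
    y = [y1]
    for step in d1:                   # rebuild the quantized signal
        y.append(y[-1] + step)
    return y


y = [10, 12, 14, 16, 18, 17, 16, 15]  # hypothetical quantized samples
y1, d1_first, d2 = second_order_delta(y)
print(d2)                             # [0, 0, 0, -3, 0, 0]
print(reconstruct(y1, d1_first, d2) == y)   # True
```

The example signal has piecewise-constant slope, so the second-order differences are mostly zero; this is the redundancy that the subsequent run-length stage exploits. Keeping Y1 and the first sample of Δ1 alongside Δ2 is what makes the round trip exact.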
The number of bits allocated to each element (denoted as bits per sample, or 'bpsm', in the rest of this paper) should be the same for every element and equal to the number of bits needed to express the maximum absolute value (MAV) in that array along with its sign. In Fig. 4, the MAV of the Δ2 array is 16, so 6 bits are needed to store each element (5 bits for the amplitude and 1 bit for the sign). Since the Δ2 and Ɵ arrays contain 23 and 16 elements, they need 138 (23×6) and 96 (16×6) bits, respectively; hence CR = 138/96 = 1.4375. To increase this CR, two separate buffer arrays were introduced in this work, as follows.
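The bit-count arithmetic of this example can be checked with a few lines of Python. The helper name `signed_bits` is hypothetical; it assumes a sign-magnitude layout (magnitude bits plus one sign bit), which matches the 5 + 1 bits quoted for MAV = 16.

```python
def signed_bits(max_abs_value):
    """Bits for a sign-magnitude value: magnitude bits plus one sign bit."""
    return max(max_abs_value, 1).bit_length() + 1


bpsm = signed_bits(16)        # MAV = 16 in the Fig. 4 example
delta2_bits = 23 * bpsm       # 23 elements in the delta encoded array
theta_bits = 16 * bpsm        # 16 elements after run-length encoding
cr = delta2_bits / theta_bits
print(bpsm, delta2_bits, theta_bits, cr)   # 6 138 96 1.4375
```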
In Fig. 4, if the number of occurrences of '0's becomes higher than the MAV of the signal, the bpsm of the Ɵ array exceeds the number of bits needed to store the MAV. In addition, reducing the MAV lowers the bpsm, which ultimately reduces the number of bits needed to express the entire encoded array. With these facts in mind, two buffer arrays were proposed in this work, viz. 1) buffer array-1, denoted by Ψ, used to store the number of occurrences of '0's, and 2) buffer array-2, denoted by ξ, used to decrease the bpsm. Two parameters, α and β, were introduced to link the main encoded array Ɵ to the two buffer arrays Ψ and ξ. To reduce the MAV, two arrays (row vectors), Tsmp and Tbit, were generated as explained below.
The i-th element of Tsmp (i = 1, 2, 3, …, J, where J is the number of bits needed to store the actual MAV of the Δ2 array with its sign; for the previous example in Fig. 4, J = 6) contains the number of samples in the Δ2 array that can be expressed by (2+i) or more binary bits. For the array shown in Fig. 4, Tsmp = [23 2 2 1]: all 23 samples can be expressed in 3 or more bits, hence the first element of Tsmp is 23; similarly, 2 samples need 4 or more bits, hence the second element is 2, and so on. In this work, the samples that needed a higher number of bits were stored in the ξ array with J bpsm and were replaced in the Δ2 array by an identification value, 'α', so that the Ɵ array can be stored in K bpsm, where K ≤ J. To find the optimal value of K, the Tbit array was used, whose i-th element, Tbit(i), relates to the number of bits needed to store the entire compressed data. In this example, the array Δ2 can be expressed in K (= 3) bpsm with α = 2^(K-1) = 4 as the replacement for the samples stored in the ξ array. To conclude, the samples of Δ2 that needed more than K (= 3) bits (i.e., -16 and 8 in the example shown in Fig. 4) were replaced by α and stored sequentially in the ξ array, with the Ψ array updated as explained below.
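The construction of Tsmp can be illustrated with a short Python sketch. The 23-sample array below is hypothetical (it is not the array of Fig. 4, only one that shares its two large values, -16 and 8), the function names are illustrative, and a sign-magnitude bit count is assumed. Tbit is not sketched because its expression is not available here.

```python
def signed_bits(value):
    """Bits needed for one sample: magnitude bits plus one sign bit."""
    return max(abs(value), 1).bit_length() + 1


def t_smp(delta2, min_bits=3):
    """Counts of samples per bit width, starting at min_bits.

    The first entry counts every sample (all fit in min_bits or more);
    each later entry counts the samples that need at least that many bits.
    """
    j = max(signed_bits(v) for v in delta2)      # bits for the MAV with sign
    counts = [len(delta2)]
    for bits in range(min_bits + 1, j + 1):
        counts.append(sum(1 for v in delta2 if signed_bits(v) >= bits))
    return counts


# Hypothetical 23-sample array: two large values, the rest within +/-3.
delta2 = [0] * 15 + [1, -2, 3, -1, 2, -3] + [-16, 8]
print(t_smp(delta2))          # [23, 2, 2, 1]

k = 3                         # bpsm chosen for the main encoded array
alpha = 2 ** (k - 1)          # marker value, 4 for K = 3
print(alpha)                  # 4
```

With K = 3, every sample whose magnitude is below α = 4 fits in the main array, and only the two outliers have to be moved to the ξ buffer.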
Buffer array 1: the main function of buffer array Ψ was to store the number of occurrences of consecutive '0's in the array Δ2. This array also acted as an identifier of the sample replacement from Δ2 to the ξ array mentioned above, and was used to encode the main array Δ2 into the Ɵ array. The Ɵ array was stored in K bpsm and the Ψ array in β bpsm, with β provided manually. A simple algorithm, given in Table II, shows how these three encoded arrays (i.e., Ɵ, Ψ, ξ) are generated from the delta encoded array Δ2.
The algorithm in Table II starts by searching for consecutive zeros in the delta encoded array Δ2 and counts each run in a counter 'c'. Then, 1) if more than two consecutive zeros are counted, the algorithm divides c by (2^β - 1) and stores the remainder and quotient in two scalars, 'r' and 'q'. The entire run of c consecutive '0's is replaced by inserting α in the array Ɵ 'q+λ' times (λ = 1 if r ≠ 0, otherwise λ = 0), and the Ψ array is updated by appending (2^β - 1) 'q-γ' times (γ = 0 if r ≠ 0, otherwise γ = 1), followed by appending 'r' once (only if r ≠ 0).
2) If the algorithm finds a non-zero element in Δ2 that is less than α, the element is appended directly to the Ɵ array; otherwise, 3) if the non-zero element is not less than α, it is replaced by appending α once to the Ɵ array, '0' is appended once to the Ψ array, and the element itself is stored in the ξ array.
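The three encoding rules above can be sketched in Python as follows. This is an illustrative reading of the description rather than a transcription of Table II, and it rests on several assumptions that are labeled in the code: the comparison with α is made on the absolute value, runs of one or two zeros are copied literally, and every α written to Ɵ is paired with exactly one entry of Ψ (so a run whose length is an exact multiple of 2^β - 1 produces q full entries). The function name `encode_zero_runs` is hypothetical.

```python
def encode_zero_runs(delta2, alpha, beta, min_run=3):
    """Split delta2 into the main array theta and the buffers psi and xi."""
    max_run = (1 << beta) - 1             # largest count that fits in beta bits
    theta, psi, xi = [], [], []
    i, n = 0, len(delta2)
    while i < n:
        value = delta2[i]
        if value == 0:
            j = i
            while j < n and delta2[j] == 0:
                j += 1
            c = j - i                     # length of this run of zeros
            if c >= min_run:              # "more than two" consecutive zeros
                q, r = divmod(c, max_run)
                # Assumption: one psi entry per alpha marker.
                counts = [max_run] * q + ([r] if r else [])
                theta.extend([alpha] * len(counts))
                psi.extend(counts)
            else:
                theta.extend([0] * c)     # assumption: short runs kept as-is
            i = j
        elif abs(value) < alpha:          # assumption: magnitude comparison
            theta.append(value)
            i += 1
        else:
            theta.append(alpha)           # marker in the main array
            psi.append(0)                 # 0 means "fetch the value from xi"
            xi.append(value)
            i += 1
    return theta, psi, xi


# Hypothetical second-order delta array
d2 = [0] * 10 + [1, -16, 0, 0, 2, 8, 0, 0, 0]
theta, psi, xi = encode_zero_runs(d2, alpha=4, beta=3)
print(theta)   # [4, 4, 1, 4, 0, 0, 2, 4, 4]
print(psi)     # [7, 3, 0, 0, 3]
print(xi)      # [-16, 8]
```

In this example the 19 input samples become 9 entries at K = 3 bpsm, 5 entries at β = 3 bpsm, and 2 entries at J bpsm, which shows how the three arrays trade entry count against bit width.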
During decoding, the second-order delta encoded array was first reconstructed in an initially empty array, denoted by Φ, from the three encoded arrays (i.e., Ɵ, Ψ, and ξ). The reconstruction algorithm is shown in Table III. The algorithm reads the elements of Ɵ consecutively. If an element is not equal to α, it is appended directly to Φ; otherwise, the algorithm reads the next element of Ψ and increments the read index of that array by 1. If that element is non-zero, the corresponding number of '0's is appended to Φ; if it is zero, the next sample of the ξ array is appended to Φ and the read index of ξ is incremented by 1. In this way, the second-order delta encoded array is regenerated, from which the original signal is recovered by cumulative addition, starting from the two separately stored samples.
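The decoding walk described above can be sketched in Python as follows. The function name `decode_zero_runs` and the sample arrays are hypothetical; the sketch assumes, as the description implies, that each α in Ɵ consumes exactly one entry of Ψ.

```python
def decode_zero_runs(theta, psi, xi, alpha):
    """Rebuild the second-order delta array (phi) from theta, psi and xi."""
    phi = []
    psi_pos = 0                      # read index into psi
    xi_pos = 0                       # read index into xi
    for element in theta:
        if element != alpha:
            phi.append(element)      # ordinary sample, copied as-is
            continue
        count = psi[psi_pos]
        psi_pos += 1
        if count != 0:
            phi.extend([0] * count)  # marker stood for a run of zeros
        else:
            phi.append(xi[xi_pos])   # marker stood for a large sample
            xi_pos += 1
    return phi


# Hypothetical encoded arrays with alpha = 4
theta = [4, 4, 1, 4, 0, 0, 2, 4, 4]
psi = [7, 3, 0, 0, 3]
xi = [-16, 8]
phi = decode_zero_runs(theta, psi, xi, alpha=4)
print(phi == [0] * 10 + [1, -16, 0, 0, 2, 8, 0, 0, 0])   # True
```

Because a zero in Ψ is reserved as the "read from ξ" flag, run counts stored in Ψ must always be at least 1, which is why zero-length entries never appear for runs.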

E. Selection of optimal value for β
It was observed that the CR varied with β for a single-channel signal. Hence, it was necessary to select the optimal value of β before initiating the compression. The value of β for which the CR was maximum, denoted βm, depended on the sampling frequency (SF), bpsm, and signal length. Therefore, to select βm, a multilayer perceptron neural network (MLPNN) was trained using the feedforward backpropagation algorithm and a sigmoidal activation function. Table IV lists the hyperparameters chosen for this network. Hyperparameter selection was needed to choose the best network model, one that increases neural network (NN) training efficiency while reducing convergence time. Eight parameters were considered, as shown in column 1 of Table IV, with their ranges (or choices) in column 2. These ranges/choices were chosen heuristically to suit the network model. Column 3 of Table IV shows the optimal values of these parameters. When training with these network parameters, there was a sudden increase in the network loss, expressed as mean squared error (MSE), at higher numbers of training epochs. Since this might hamper the overall training of the NN, the number of epochs and the number of neurons were manually set to 1000 and 30, respectively. With these settings the training proceeded without the loss spike, yielding a maximum absolute error (MAE) of 1.09% and an MSE of 0.0325%. These two quantities were calculated between the output predicted by the trained network for the actual input dataset and the output set used during training. Fig. 6 shows the training loss for both the optimal hyperparameters and the manually modified ones.
The input layer of the NN consisted of 1) SF, 2) bpsm, and 3) signal length, and the output layer provided only the optimal value of βm, denoted βm NN, that achieves the maximum CR for the specifications given at the input layer. The training dataset was created using the optimal value of βm found by the hit and trial (HT) method, denoted βm HT, on various ECG and PPG signals; for testing, separate databases were used to check the accuracy. Details of the databases and the accuracy of the NN are given in Section III.

A. Data collection and noise elimination
The performance of the proposed compression and reconstruction algorithms was tested using various physiological signals, such as ECG and PPG, that are available in Physionet [34]. In addition, the method was tested on single-channel ECG data collected from 10 healthy volunteers, with their informed consent, for run-time operation (i.e., data collection, compression, transmission, and reconstruction). The volunteers were aged between 18 and 57 (details are given in Table V). The data were collected in the biomedical research laboratory of the Department of Electrical Engineering, Indian Institute of Technology Roorkee, using the R-Pi and AD8232 ECG module mentioned earlier. Each signal was first denoised using a second-order bandpass filter with a passband of 0.5 to 100 Hz for ECG signals and 0.5 to 3.4 Hz for PPG signals, so that the AR stayed within 10%. A pictorial representation of the denoising process is shown in Fig. 7.
The quality measurement indices of the proposed work are described below, followed by the details of the databases and the experimental results for each signal.

B. Quality measure indices
In this work, the amount of compression was expressed as the compression ratio (CR) [32], which compares the number of bits needed to store the original signal with the number needed to store the compressed data:

$$\mathrm{CR} = \frac{\text{bits needed to store original signal}}{\text{bits needed to store compressed data}}$$

The quality of the quantization process is measured using the percentage root mean squared difference (PRD), which is a measure of quantization error. Because the proposed compression is completely lossless, PRD in this section refers only to the quantization error introduced before compression. It is computed from Ui and Vi, the i-th samples of the original signal U and of the signal V obtained after quantization and de-quantization, both of length L. For good compression, the CR should be as high as possible, and the quantization level should be high enough to keep the quantization error low before compression.
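Both indices are straightforward to compute. The sketch below is illustrative only: the function names are hypothetical, and the PRD expression is the commonly used non-mean-subtracted form, 100 × sqrt(Σ(Ui − Vi)² / ΣUi²), which is an assumption about the definition intended here.

```python
import math


def compression_ratio(original_bits, compressed_bits):
    """CR = bits for the original signal / bits for the compressed data."""
    return original_bits / compressed_bits


def prd(u, v):
    """Percentage root mean squared difference between two equal-length signals.

    Assumed form: 100 * sqrt(sum((u - v)^2) / sum(u^2)).
    """
    if len(u) != len(v):
        raise ValueError("signals must have the same length")
    num = sum((a - b) ** 2 for a, b in zip(u, v))
    den = sum(a * a for a in u)
    return 100.0 * math.sqrt(num / den)


print(compression_ratio(138, 96))        # 1.4375
print(prd([3.0, 4.0], [3.0, 4.0]))       # 0.0
```

A PRD of zero means the de-quantized signal matches the original exactly; any non-zero value here reflects quantization only, since the compression stage itself is lossless.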

C. Compression result on run-time data
The AD8232 provides an analog output voltage representing single-channel ECG from three electrodes placed on the body. The three electrodes, colored red, green, and gray, should be placed on the chest so that 'Einthoven's triangle' law [1] is fulfilled. The module has a 3.5 mm ECG connector that is internally connected to the three ports mentioned above, which makes the module easy to use. To convert the analog output of this module into a digital signal, an MCP3008 ADC chip was used, which communicates with the R-Pi over the SPI bus protocol. The circuit diagram and connection scheme were explained earlier. The single-channel ECG was captured at 10 bpsm and 125 Hz SF from 10 volunteers. The signal length was varied from 2000 to 40000 samples to compare the results for different manually provided values of β. The average results for these run-time data are shown in Table VI. The CR changed slightly with β. In addition, for the same β, the CR increased with signal length.

D. Compression result on PTBDB ECG database
The proposed algorithm was tested on the PTBDB ECG database available in Physionet [34]. This database contains 549 multilead ECG (MECG) records categorized among various annotation types. Each MECG signal contains 15 simultaneous leads, viz. 12 conventional leads (I, II, III, aVL, aVR, aVF, V1, V2, V3, V4, V5, and V6) and 3 Frank leads (Vx, Vy, and Vz), of which only lead II was selected for computation. Each signal is sampled at 1 kHz SF and 16 bpsm. The signals were obtained from 290 patients aged 17 to 87 (mean age 57.2), of whom 81 were female (mean age 61.6) and 209 were male (mean age 55.5). Because the proposed algorithm starts by quantizing the single-lead signal with 'n' bits, each PTBDB record was first quantized using n = 6, 8, 10, 12, 14, and 16; the quantization error in each case (after dequantizing the signal with the respective value of 'n') is shown in Fig. 7 in the form of PRD. It was also observed that the PRD changes with signal length for the same quantization level (Qlvl). Therefore, Fig. 7 shows the variation of quantization error with signal length (2000, 5000, 10000, 15000, 20000, and 30000 samples) for the Qlvl values mentioned above. It can be concluded that PRD varies in proportion to Qlvl and in inverse proportion to signal length. Based on these six values of Qlvl, the variation of the average CR on the PTBDB records is shown in Fig. 8 for variable signal length and different values of β. Table VII shows the average CR and PRD on PTBDB records using a 16-bit quantization level with variable signal length and β. The highest CR was achieved with β = 2.

E. Compression result on MITDB ECG database
A separate analysis was performed using 48 MITDB records, which were sampled at 360 Hz with 11-bit data resolution. Each record has two channels, of which lead II was used for the analysis. Each MITDB record was quantized using different Qlvl with 'n' = 6, 8, 10, and 11. As the performance varies with signal length, compression was performed on 2000, 5000, 10000, 20000, 30000, and 40000 samples separately for each Qlvl. The variation of PRD for each quantization process is shown in Fig. 9. Finally, for each quantization level and signal length, a separate analysis was performed with β = 2, 3, 4, 5, and 6, which is shown in Fig. 9. From this figure, it is clear that CR decreases as Qlvl increases. Fig. 9 also shows that PRD is very low for Qlvl of 10 and 11; with these Qlvl, the highest CR was achieved using β = 3, as shown in Fig. 10. Table VIII gives the variation of CR and PRD for the 48 MITDB records with variable signal length and β, at 11 bpsm and 360 Hz SF. From Table VIII, it can be concluded that CR is highest for β = 3 and that both CR and PRD increase with signal length.

F. Compression result on MIMIC-II PPG database
The BIDMC or MIMIC-II database contains 53 PPG records with 125 Hz sampling frequency. These signals were taken from 53 adult patients with a mean age of 64.81, of whom 32 were female. The proposed algorithm was tested on this entire dataset, with compression initiated by quantizing each signal using various Qlvl, as for the ECG signals. The PRD is shown in Fig. 11 for Qlvl of 6, 8, 10, and 12, each applied to signal lengths of 2000, 5000, 10000, 20000, 30000, and 40000 samples. Based on these quantization levels and signal lengths, the variation of CR with β is shown in Fig. 12. Finally, the CR for 12-bit data resolution with variable signal length and β is shown in Table IX. From this table, it can also be concluded that the CR increases with signal length and with decreasing Qlvl.

G. Training of MLPNN to obtain optimal value of β
The performance of the algorithm depended on four parameters, viz. SF, Qlvl, signal length, and β. The SF and Qlvl depended on the device used for data collection, and the signal length used for processing was chosen by the user. The value of βm varied with SF, Qlvl, and signal length. In this work, β was chosen by the hit and trial (HT) method between 2 and 6, and a separate analysis was made for each value. Table X shows the optimal values of βm HT for ECG and PPG signals with different SF, Qlvl, and signal length. The values of β are shown in bold within this table for ease of understanding. For every signal, the value of β increased as Qlvl decreased and as signal length increased. Using these signal specifications and βm HT values, the MLPNN was trained as mentioned earlier on 150 training records with 10-fold cross validation (CV), with a final maximum absolute error of 1.09%. The training and validation datasets are detailed in Table XI. For the same SF, signal length, and Qlvl, the value of β was found to be almost fixed. Furthermore, as the signal length increased above 10000, the value of β remained constant for each group (i.e., signal type, SF, and Qlvl). Consequently, a nearly equal number of records was chosen for each signal type to ensure uniform training, with the signal length not exceeding 12000. The 10-fold cross validation divided the 150 records into ten groups of 15 records each. In each fold, the NN was trained on nine groups and the remaining group was used to evaluate the trained NN on new input data. The final CV accuracies of the 10 models were 0.823, 0.889, 0.967, 0.867, 0.833, 0.942, 0.965, 0.833, 0.933, and 0.965.
Though accuracy is a qualitative concept, in NN prediction it is calculated as the ratio of the number of correct predictions to the total number of predictions [33]. Greater accuracy therefore indicates better training. The mean loss of these models is shown in Fig. 6. The accuracy of the third model was the highest; hence, this model was selected as the trained NN. After training was completed, the blind validation accuracy was measured using 96 records, shown in Table XI. The overall accuracy measured on these records was approximately 0.953. This CV technique was used to improve the training accuracy. Using this trained NN, the compression results obtained with βm HT and βm NN were compared, as described below.
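The fold structure and the accuracy measure described above can be sketched in Python as follows. The sketch only shows how 150 records split into ten held-out groups of 15 and how accuracy is counted; it does not train a network, the function names are hypothetical, and the contiguous (unshuffled) assignment of records to folds is an assumption.

```python
def k_fold_splits(n_records, k):
    """Yield (training_indices, held_out_indices) for each of k folds."""
    indices = list(range(n_records))
    fold_size, extra = divmod(n_records, k)
    start = 0
    for fold in range(k):
        stop = start + fold_size + (1 if fold < extra else 0)
        held_out = indices[start:stop]
        training = indices[:start] + indices[stop:]
        yield training, held_out
        start = stop


def accuracy(predicted, expected):
    """Fraction of predictions that match the expected values."""
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


folds = list(k_fold_splits(150, 10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))   # 10 135 15

# Hypothetical predicted vs. hit-and-trial beta values for four records
print(accuracy([2, 3, 3, 4], [2, 3, 4, 4]))             # 0.75
```

Each record is held out exactly once, so the ten accuracies reported above are measured on disjoint groups of 15 records.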

H. Testing of accuracy of MLPNN
To test whether the NN was able to provide βm, one ECG database and one PPG database were used. The European ST-T database [34] contains 90 records with two-channel ECG measurements, sampled at 12 bpsm and 250 Hz. The records were obtained from 79 patients, of whom 8 were female (aged 55-71) and 70 were male (aged 30-84); lead II was used for computation. The IEEE Signal Processing Cup Challenge (SPCC) 2015 database [35] contains six-channel records of various physiological signals, of which the PPG signal (channel 2) was used for testing. The records were measured from various patients aged between 18 and 35 at 125 Hz sampling frequency. Fifteen records were arbitrarily selected from each of these two databases (15 ECG and 15 PPG) for testing; hence, the training and validation datasets were completely different from the test dataset. The experimental results are shown in Table XII. The values of βm obtained from the HT method (βm HT) and from the NN (βm NN) are almost the same for both ECG and PPG signals, except for the ECG signal with a signal length of 2000. The CR was 3.24 with the HT method and 3.22 with the NN, so the difference was very small. Thus, the NN was able to provide a nearly optimal value of βm for both ECG and PPG signals.

I. Comparison with existing techniques
Table XIII presents a comparative study of previously published works on lossless ECG compression and the proposed work, tested on the 48 records of the MITDB database (except S.K. Mukhopadhyay et al., 2011 [16], which is a lossy work implemented on the PTBDB database). Miaou et al. [7] proposed SPIHT based lossless compression of the wavelet coefficients of ECG data. A very good CR of 3.02 was achieved in their work. LP, which predicts future samples of a time domain series from previous samples, is a very useful tool for lossless compression and has been used in various works. A new Huffman encoding based two stage encoding strategy, used in [8] along with prediction, resulted in a CR of 2.43. A modified length coding using a multistage adaptive region predictor [9] resulted in an increased CR of 2.67. A new short-term LP based lossless compression was proposed by Li in [9], in which the CR was slightly decreased to 2.28. Hardware implementation of QRS detection and lossless compression was proposed by C. Deepu et al. [10][11]. This work was implemented on the MITDB ECG database and resulted in a CR of 2.15. T. Tsai et al. [12] introduced GRC along with an adaptive LP topology for lossless compression. For run-time decoding purpose, they utilized ... As compared to previous techniques, the proposed work introduced RLE and DE for lossless compression of the ECG signal. The computational complications were also very low in this work. A new technique of applying a buffer array to RLE was proposed, which provided a high CR of 3.82 for MITDB records with 11 bpsm and 360 Hz SF. The algorithm was run on an R-Pi module with run-time ECG data collected using the AD8232 ECG module. The algorithm was also run on a smartphone, which provided a better and easier way of usage.
In [16], the author used ASCII character coding-based ECG compression implemented on the PTBDB database, with a CR of 7.18. This work was not completely lossless, as a very low PRD (generated due to compression) of 0.023 was obtained. T. Tsai et al. [17] proposed a completely lossless compression using adaptive LP and GRC, implemented on the PTBDB database, which resulted in a CR of 4.073. As compared to this work, implementation of the proposed algorithm on 549 PTBDB records provided a completely lossless result with a higher CR of 6.52.
In the area of PPG compression, although there are a few papers, almost all are lossy works. Reddy [18] used Fourier transformation for reducing motion artifacts and for data compression of the PPG signal. ASCII character coding-based Huffman coded PPG compression was proposed in [19], which is still not lossless. Along with PPG compression, the author of [21] proposed steganography to hide confidential information within the compressed data, which resulted in a PRD higher than 7%. In recent work, R. Gupta [22] proposed Huffman encoding based run-time lossless PPG compression, which resulted in a CR of 1.74 and a PRD of 0.055. As compared to previous works, the proposed work introduced a completely lossless compression of the PPG signal using second order DE and RLE with buffer array, which provided a CR of 2.49 for the MIMIC-II database with 12 bpsm and 125 Hz SF.

J. Discussion
The algorithm proposed in this work is a completely lossless compression method that can be implemented on various biological signals. All the previous works were tested on a single type of either ECG (lead-II) or PPG signal. The proposed work was tested on 3 ECG and 2 PPG databases; hence, it is evident that this algorithm can readily be applied to physiological signals with interbeat coherency, irrespective of sampling frequency and bpsm. The DE basically eliminated the redundancy, and the buffer array based RLE increased the CR without generating any PRD. For run-time operation, the data needed to be quantized, and it was observed that the error was reduced by providing a higher Qlvl. On the other hand, both CR and quantization error increased with signal length. To analyze the run-time performance, a hardware implementation of the proposed work, covering data collection, compression/de-compression and run-time data transmission, was realized using R-Pi. This algorithm was also run on a smartphone device, and the data reconstruction procedure after downloading the compressed data is shown in Fig. 13, along with the quantization error curve. Physiological signals are highly noisy in nature, and with heavily noise-contaminated signals the performance of the proposed algorithm might be weaker. To analyze this aspect of the proposed algorithm, 20 signals were arbitrarily selected from each of the PTBDB, MITDB and MIMIC-II databases (60 signals in total), and white Gaussian noise of various noise powers was added to each signal (represented by SNR = 10log10{(signal power)/(noise power)} in dB). The compression result is shown in Table XIV. It was observed that CR decreased as the noise power increased (it is worthwhile to mention that, for each signal, CR was measured with the same βm for the different noisy versions). Though CR increased with signal length, the rate of increase gradually reduced as the number of samples increased.
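The noise-addition step of this analysis can be sketched in Python as follows. This is a minimal illustration, not the original test code: it assumes that signal power is the mean squared amplitude of the samples (whether the mean was removed first is not stated), and the function names, the seed parameter and the synthetic sine wave are hypothetical.

```python
import math
import random


def signal_power(samples):
    """Mean squared amplitude of the samples (assumed power definition)."""
    return sum(s * s for s in samples) / len(samples)


def add_white_gaussian_noise(samples, snr_db, seed=None):
    """Return a noisy copy of samples at approximately the requested SNR in dB.

    The noise power follows from SNR = 10*log10(signal power / noise power).
    """
    rng = random.Random(seed)
    noise_power = signal_power(samples) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)  # standard deviation of zero-mean noise
    return [s + rng.gauss(0.0, sigma) for s in samples]


def measured_snr_db(clean, noisy):
    """SNR in dB actually realized by a noisy version of a clean signal."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(signal_power(clean) / signal_power(noise))


# Synthetic stand-in for a physiological signal (not a database record).
clean = [math.sin(2 * math.pi * i / 125) for i in range(5000)]
noisy = add_white_gaussian_noise(clean, snr_db=10, seed=1)
```

Because the noise is random, the realized SNR returned by `measured_snr_db` only approximates the target, and the approximation improves with the number of samples.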
In addition, it was observed that CR varied with the input value of β. The value of β was provided manually, and therefore the result analysis was performed so as to analyze the CR with varying β. For ... work proposed a complete lossless compression, the reconstructed signal was exactly the same as the original signal, resulting in no distortion of clinical features. To verify this, for each signal the sample-to-sample error signal was computed between the original and reconstructed signals (in the digital domain); it was a straight line with a constant value of zero, which ensured that the various characteristic domains of ECG and PPG were intact and all the samples had their exact values after reconstruction.
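The zero-error check described above can be illustrated with a simplified Python sketch of second order delta encoding followed by run-length encoding. It uses plain RLE only and does not reproduce the buffer array of the proposed algorithm or its bit-level packing; all function names and the sample values are hypothetical.

```python
def delta_encode(values):
    """First-order delta: keep the first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [values[i] - values[i - 1] for i in range(1, len(values))]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values):
    """Plain run-length encoding into (value, run length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    """Expand (value, run length) pairs back into a flat list."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


def compress(samples):
    # Second order DE is first-order DE applied twice, followed by RLE.
    return rle_encode(delta_encode(delta_encode(samples)))


def decompress(runs):
    return delta_decode(delta_decode(rle_decode(runs)))


# Hypothetical quantized integer samples, not taken from any database.
samples = [512, 512, 512, 514, 516, 518, 518, 518]
reconstructed = decompress(compress(samples))
error_signal = [a - b for a, b in zip(samples, reconstructed)]
```

Since every stage operates on integers and is exactly invertible, `error_signal` contains only zeros for any integer input, which corresponds to the constant zero error line described above.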

IV. CONCLUSION
In this work, a run-time lossless data compression of single channel ECG (lead-II) and PPG signals is proposed, based on delta and run length encoding. The main aim was to reduce the intrabeat redundancy; hence, this algorithm can also be implemented on any other signals that have intra sample coherency. The compression result was tested on universally accepted databases (MITDB, PTBDB, and MIMIC-II) available on PhysioNet. In addition, run-time data collection was performed using the AD8232 ECG module and R-Pi, and data transmission was performed over an active internet connection. The ECG module provides Lead I, and thus it can be concluded that the proposed algorithm is independent of lead category; however, result analysis and comparison were performed using standard ECG (Lead II) and PPG records. In this work, no feature extraction was performed and no data loss occurred; hence, the run-time data, which were collected from healthy volunteers, needed no approval of a medical authority for quality and acceptability analysis. This algorithm was also run on a smartphone device, thus providing easy usage to doctors and physicians. The entire algorithm was implemented in the Python programming platform and tested on R-Pi, thus providing an easy and standalone operation. As physiological signals are not completely free from noise, a noise sensitivity analysis was performed by adding white Gaussian noise of various noise powers, but a comparison of results could not be performed in this respect due to the lack of sufficient information. However, the comparative analysis with the existing works indicated that this procedure provided a more user-friendly, easier, less computationally complicated, lossless data compression and run-time transmission process for both ECG and PPG signals, with high CR and no reconstruction loss.