Theoretical Performance Bounds of Model-Based Electrocardiogram Parameter Estimation

Objective: Clinical parameter estimation from the electrocardiogram (ECG) is a recurrent field of research. It can be argued that ECG parameter estimation, whether performed by human experts or by machines/algorithms, is always model-based (implicitly or explicitly). Therefore, depending on the selected data-model, the adopted estimation scheme (least-squares error, maximum likelihood, or Bayesian), and the prior assumptions on the model parameters and noise distributions, any estimation algorithm used in this context has an upper performance bound, which cannot be exceeded (for the same model and assumptions). Method: In this research, we develop a comprehensive theoretical framework for ECG parameter estimation and derive the Cramér-Rao lower bounds (CRLBs) for the most popular signal models used in the ECG modeling literature, namely basis expansions (including polynomials) and sums of Gaussian functions. Results: The developed framework is evaluated over real and synthetic data for three popular applications: T/R ratio estimation, ST-segment analysis, and QT-interval estimation, using state-of-the-art estimators in each context, and compared with the derived theoretical CRLBs. Conclusion and Significance: The proposed framework and the derived CRLBs provide fact-based guidelines for the selection of data-models, sampling frequency (beyond the Nyquist requirements), modeling segment length, the number of beats required for average ECG beat extraction, and other factors that influence the accuracy of ECG-based clinical parameter estimation.


I. INTRODUCTION
The electrocardiogram (ECG) is a rich source of information for cardiac diagnoses. For decades, a significant branch of biomedical research has been devoted to the reliable extraction of clinical parameters from the ECG. In recent years, new technologies such as wearable healthcare monitoring gadgets [1], ECG acquisition during stress-tests and sports [2], and noninvasive fetal ECG monitoring have emerged [3], which have once again brought about the necessity of accurate ECG parameter extraction, especially in edge computing and highly noisy scenarios.
It can be argued that ECG annotation and parameter extraction are always implicitly or explicitly "model-based". Expert human annotators use visual features such as local peaks, morphological variations, rhythms, level-crossings, slopes, intersections of curves, etc. These features are essentially based on mental or verbalized models of the ECG waveform, which are learned by experience and formalized in textbooks [4], [5].
Machine-based annotations are in turn either structural: based on the same set of models and features used by humans and formalized in mathematical/algorithmic form [6]; or statistical: based on novel machine learning (ML) techniques that feed the ECG (or its features) to a learning algorithm or neural network, which implicitly learns the data-model [7].
Mathematical models used for ECG annotation are rather diverse, among which polynomial expansions [8]-[10] and sums of Gaussian functions [11]-[13] are the most popular. Standard ECG annotation software uses combinations and intersections of such models to define wave peaks, fiducial points, onsets, offsets, DC levels, durations, etc., which in turn are used for diagnostic and clinical purposes by experts and algorithms.
From the estimation theoretical perspective, model-based parameter estimation in the presence of measurement noise and model inaccuracies has theoretical bounds, which may not be exceeded by any estimation scheme. The Cramér-Rao Lower Bound (CRLB) is the most popular of such bounds, which imposes a lower bound on the mean square error of any estimator, for a presumed signal-noise model [14]. The estimation of ECG parameters (either by experts or machines) is not an exception, and to date, few attempts have been made to formulate the intrinsic theoretical limitations and accuracy bounds of model-based ECG parameter extraction.
In this work, we study the problem of ECG parameter extraction from the estimation theoretical perspective and present a systematic approach to the estimation of arbitrary ECG parameters. It is shown how clinical ECG parameter estimation can be translated into a formal estimation problem.
Within this framework, we focus on the most popular models used in ECG parameter extraction, namely linear-in-parameters models (specifically polynomial expansions) and sums of Gaussians. It is later shown how these models perform in standard parameter estimation frameworks: Least Squares (LS), Maximum Likelihood (ML) and Bayesian. It is shown that although the selection of the data-model and estimation scheme is arbitrary (and application dependent), the choice has inevitable consequences on the attainable performance bounds, regardless of the algorithm used for parameter estimation. We specifically derive the CRLB for the most popular ECG models and present some common rules of thumb for improving the CRLB of ECG parameters. As case studies, the proposed framework is used to study the problems of T/R ratio estimation, ST-segment analysis, and QT-interval estimation, using state-of-the-art ECG parameter estimators. The numeric estimation performance bounds are compared with the derived CRLBs.
The results of this study are shown to have significant applications for computerized and non-computerized annotations, and highlight the fact that ECG parameters should always be reported with confidence intervals and are only valid within the scope of the utilized ECG model and estimation scheme.
In Section II, the required estimation theoretical background is reviewed and the problem of clinical ECG parameter estimation is formulated as an estimation problem. In Section III, we derive the CRLB for well-known ECG models, followed by some practical extensions of the framework in Section IV. In Section V, several practical ECG parameter estimation problems are studied and the performance of different algorithms versus the derived CRLBs is compared on both real and synthetic data. A detailed discussion regarding the outcomes of this study is presented in Section VI. The paper is concluded with a summary and future perspectives in Section VII.

II. BACKGROUND AND PROBLEM FORMULATION

A. From mathematical models to clinical parameters
Standard ECG delineator software and computerized devices commonly follow similar steps to analyze ECG data. This includes: 1) preprocessing and quality enhancement (baseline-wander removal, powerline cancellation and out-of-band noise rejection); 2) R-peak detection and heart-rate time-series calculation; 3) ECG beat extraction; 4) morphological feature extraction from isolated or average beats [15]-[17]. The parameter estimation step commonly consists of fitting a parametric model onto the beat (or a segment of the beat).
Suppose that x(t) denotes an ECG beat, an average beat or a segment-of-interest (SoI) of a real ECG, modeled as

x(t) = x̃(t; θ) + e(t),    (1)

where t denotes time, x̃(t; θ) is a parametric model of the SoI, θ ∈ R^P is a vector of model parameters, and e(t) is measurement noise. Defining x ∈ R^N as the vectorized form of the time samples x(nTs) (n = n0, n0+1, ..., n0+N−1), i.e., a segment of N samples with sampling time Ts (and sampling frequency fs = 1/Ts) starting at an arbitrary initial time t0 = n0Ts, x̃(θ) ∈ R^N as the vectorized form of the model samples x̃(nTs; θ), and e ∈ R^N the vector form of the measurement noise e(nTs), the vectorial equivalent of (1) is:

x = x̃(θ) + e.    (2)

The model parameters θ are related to the clinical parameter of interest through some transformation. In both manual and computerized annotations, the clinical ECG parameters are commonly characterized in one of the following forms: 1) local peaks of the SoI; 2) an intersection point of two curves; or 3) amplitudes (or amplitude ratios) at some fiducial point. Mathematically, these conditions translate into an equation that relates a clinical parameter γ* to the parameter vector θ:

φ(θ, γ*) = 0,    (3)

where φ(·) is a function that relates the model and clinical parameters. For example, the R-peak is identifiable as a global peak across the beat. Therefore, taking γ* as the R-peak location and φ(θ, γ) ≜ (d/dt) x̃(t; θ)|_{t=γ}, γ* is related to the model parameter vector by solving φ(θ, γ) = 0 for γ. As another example, the onsets or offsets of the ECG components can be defined as the intersection of the baseline, modeled by a parametric function b(t) (which can be a horizontal line in the simplest case), and another model x̃(t; θ) fitted over the SoI. Therefore, φ(θ, γ) ≜ x̃(γ; θ) − b(γ), and its root γ* = T(θ) is the clinical parameter of interest.
The notion of relating model parameters to clinical parameters can be extended to fiducial points, peak amplitudes, segment lengths, etc., in all cases resulting in a transformation of the form γ * = T (θ) that relates the model and clinical parameter of interest.
Note that the existence of a unique solution for (3) is an implicit assumption of model-based ECG parameter extraction. Therefore, although φ(θ, γ) = 0 might mathematically have multiple roots, the optimal γ * is selected from constraints on the clinically admissible set of parameters (e.g., constraints such as the max/min ranges, sign, width, etc. of the parameter).
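As a concrete illustration of solving (3) under clinical admissibility constraints, the sketch below finds a wave onset as the intersection of a fitted model and a baseline by bisection. The Gaussian "T-wave" and the flat baseline are toy stand-ins (not taken from the paper); the admissible search window plays the role of the clinical constraints that select a unique root.

```python
import numpy as np

def find_root(model, baseline, t_lo, t_hi, tol=1e-9):
    """Solve phi(theta, gamma) = model(gamma) - baseline(gamma) = 0 by bisection.
    The clinically admissible window [t_lo, t_hi] selects a unique root."""
    phi = lambda t: model(t) - baseline(t)
    lo, hi = t_lo, t_hi
    if phi(lo) * phi(hi) > 0:
        raise ValueError("no sign change in the admissible window")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(lo) * phi(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Toy example: a Gaussian "T-wave" crossing a flat baseline b(t) = 0.05;
# the onset is searched for on the rising edge only (the admissible window).
t_wave = lambda t: np.exp(-0.5 * ((t - 0.30) / 0.05) ** 2)
onset = find_root(t_wave, lambda t: 0.05, 0.10, 0.30)
```

In practice the bracketing window would itself come from coarse landmarks (e.g., relative to the detected R-peak), exactly as described above.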

B. Overview of estimation schemes and performance bounds
For notational purposes, we review some of the required background from estimation theory. Generally, the influential factors in model-based estimation performance are: A) a (parametric) data-model, B) stochastic assumptions on the model parameters (priors), C) stochastic assumptions on the observation noise (posteriors), and D) an estimation algorithm. Depending on the presence or absence of fact-based priors and posteriors, the following estimation schemes can be used.
1) Least squares: When only a data-model is available, without priors and posteriors on the stochastic model parameters and observation noise, the least squares (LS) family (linear, nonlinear, weighted, constrained, recursive, etc.) is the most common estimation scheme. For the additive noise model in (2), the LS estimate of the parameter vector θ is

θ̂_LS = arg min_θ ‖x − x̃(θ)‖².    (4)

A modification of the standard LS scheme is when the noise is zero-mean with a presumed covariance matrix Σe = E{e eᵀ}. In this case, weighted least squares (WLS) can be used:

θ̂_WLS = arg min_θ (x − x̃(θ))ᵀ Σe⁻¹ (x − x̃(θ)).    (5)

Note that Σe is generally non-diagonal (corresponding to colored noise). The intuition behind (5) is to whiten the additive noise and to weight the noisy observations with the inverse of their noise variances (giving smaller weights to the noisier samples).
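For a linear model, the WLS estimate in (5) has the closed form θ̂ = (HᵀΣe⁻¹H)⁻¹HᵀΣe⁻¹x, which can be sketched as follows; the polynomial design matrix and the heteroscedastic noise covariance are illustrative choices, not from the paper.

```python
import numpy as np

def wls(x, H, Sigma_e):
    """Weighted LS for x = H @ theta + e with noise covariance Sigma_e, as in (5):
    theta_hat = (H^T Sigma_e^{-1} H)^{-1} H^T Sigma_e^{-1} x."""
    W = np.linalg.inv(Sigma_e)
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ x)

rng = np.random.default_rng(0)
N = 200
t = np.linspace(-1.0, 1.0, N)
H = np.vander(t, 3, increasing=True)      # toy polynomial model: 1, t, t^2
theta_true = np.array([1.0, -0.5, 2.0])
Sigma_e = np.diag(0.01 * (1.0 + t**2))    # samples get noisier away from t = 0
x = H @ theta_true + rng.multivariate_normal(np.zeros(N), Sigma_e)
theta_hat = wls(x, H, Sigma_e)
```

The inverse-covariance weighting down-weights the noisier samples, exactly the whitening intuition described above.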
2) Maximum likelihood: Maximum Likelihood (ML) is applicable when, in addition to the data-model, the observation noise distribution is also presumed, without having the model parameters' prior distribution, or when the model parameters are deterministic. ML seeks the parameter vector θ that maximizes the likelihood of the observations:

θ̂_ML = arg max_θ f(x; θ),    (6)

where f(x; θ) denotes the likelihood function. The CRLB provides a lower bound on the error covariance of any unbiased estimator of the parameter vector θ. Accordingly, the covariance matrix of θ̂(x), an arbitrary unbiased estimator of the parameter vector (i.e., E_x{θ̂(x)} = θ), satisfies

Cov{θ̂(x)} ⪰ I_ML(θ)⁻¹,    (7)

where I_ML is the Fisher Information matrix [14]:

I_ML(θ) = −E_x{Δ_θ^θ ln f(x; θ)},    (8)

Δ_θ^θ denotes the Hessian matrix operator, and A ⪰ B denotes that (A − B) is a non-negative definite matrix. An estimator that reaches the CRLB is an efficient estimator.
3) Bayesian scheme: Bayesian estimation is the most complete estimation scheme in the reviewed trilogy, as it uses the distribution of both the model parameters and the observation noise (whenever available) to minimize an explicit cost function. If L(·) is a symmetric non-decreasing function denoting the cost of estimation error as a function of θ − θ̂, a Bayesian estimator seeks

θ̂_BYS = arg min_θ̂ E{L(θ − θ̂)}.    (9)

In a Bayesian estimation scheme, the CRLB is

Cov{θ̂(x)} ⪰ I_BYS⁻¹,  I_BYS ≜ −E_{x,θ}{Δ_θ^θ ln f(x, θ)},    (10)

where I_BYS is the Bayesian Fisher Information matrix and f(x, θ) denotes the joint probability density function of observations and parameters [18]. Replacing the joint distribution with the conditional form f(x, θ) = f(x|θ)f(θ) results in

I_BYS = −E_{x,θ}{Δ_θ^θ ln f(x|θ)} − E_θ{Δ_θ^θ ln f(θ)}.    (11)

If we define the prior Fisher Information as

I_PR ≜ −E_θ{Δ_θ^θ ln f(θ)},    (12)

then using (8), the Bayesian Fisher Information is [18, p. 84]:

I_BYS = E_θ{I_ML(θ)} + I_PR,    (13)

which shows the contribution of priors (whenever available) in reducing the CRLB, as compared with the ML framework.

C. Estimators for Gaussian noise and model parameters
The CRLB only provides a bound on the attainable estimation performance; it does not generally guide us towards an efficient estimator for different models, noise and parameter distributions. In the sequel, we review estimation schemes that assume Gaussian noise and model parameters, which are the most common in ECG parameter estimation.
1) Maximum likelihood estimation: When the measurement noise is explicitly assumed to be Gaussian distributed, e ∼ N(0, Σe), the WLS scheme in (5) is also the ML estimator [19], [20]. In this case, the gradient of the log-likelihood function, as required for calculating the CRLB from (7), is

∂ln f(x; θ)/∂θ = (∂x̃(θ)/∂θ)ᵀ Σe⁻¹ (x − x̃(θ)),    (14)

where ∂x̃(θ)/∂θ = [∂x̃1(θ)/∂θ, ..., ∂x̃N(θ)/∂θ]ᵀ is the model's Jacobian, and the Fisher Information matrix is obtained as follows (cf. [18, p. 83]):

I_ML(θ) = (∂x̃(θ)/∂θ)ᵀ Σe⁻¹ (∂x̃(θ)/∂θ),    (15)

which has an interesting interpretation in terms of the signal-to-noise matrix of the noisy data-model (1) (cf. [21] for the definition of an SNR matrix). For later reference, for the special case of linear-in-parameter models x̃(θ) = Hθ with zero-mean Gaussian noise, we obtain

θ̂_ML = (Hᵀ Σe⁻¹ H)⁻¹ Hᵀ Σe⁻¹ x,    (16)

and the Fisher Information is I_ML = Hᵀ Σe⁻¹ H, where H ∈ R^{N×P} is the matrix of P basis vectors used for modeling the SoI x. For nonlinear-in-parameter models with additive Gaussian noise, the ML estimate reduces to a nonlinear LS problem, and the general nonlinear case with non-Gaussian noise is solved by iterative methods such as Expectation-Maximization (EM), which converge to the ML solution under some general conditions [22].
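The linear-Gaussian Fisher Information I_ML = HᵀΣe⁻¹H and the resulting CRLB can be computed directly. The sketch below also checks the orthonormal-basis case numerically (each parameter's bound equals the noise variance under white noise); the random basis is purely illustrative.

```python
import numpy as np

def fisher_linear(H, Sigma_e):
    """Fisher Information of the linear model x = H @ theta + e,
    e ~ N(0, Sigma_e): I_ML = H^T Sigma_e^{-1} H, as in (15)-(16)."""
    return H.T @ np.linalg.solve(Sigma_e, H)

def crlb_linear(H, Sigma_e):
    """Per-parameter CRLBs: the diagonal of I_ML^{-1}."""
    return np.diag(np.linalg.inv(fisher_linear(H, Sigma_e)))

# White noise + orthonormal basis: every parameter's CRLB equals sigma^2.
N, P, sigma2 = 100, 4, 0.25
Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((N, P)))
bounds = crlb_linear(Q, sigma2 * np.eye(N))
```

This is the numerical counterpart of the rule of thumb used later for orthonormal expansions.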
2) Bayesian estimation: In addition to the data-model and noise distribution, a full Bayesian framework requires the parameter distribution and an optimization criterion (i.e., a cost function to minimize). As a popular case, if the parameter vector is normally distributed, i.e., θ ∼ N(μθ, Σθ), the optimal estimator, which minimizes the cost of error (9), satisfies

θ̂_BYS = arg min_θ {(x − x̃(θ))ᵀ Σe⁻¹ (x − x̃(θ)) + (θ − μθ)ᵀ Σθ⁻¹ (θ − μθ)},    (17)

which is in the form of generalized Tikhonov regularization [23]. Under the Gaussian distribution assumption, the maximum a posteriori (MAP), the minimum mean square error (MMSE) and several other well-known estimation schemes result in the same estimator [18, Sec. 2.4]. Therefore, in the sequel the MAP and MMSE estimation schemes are interchangeable.
For linear-in-parameters models of the form x̃(θ) = Hθ, (17) results in a closed-form solution [24]:

θ̂_BYS = (Hᵀ Σe⁻¹ H + Σθ⁻¹)⁻¹ (Hᵀ Σe⁻¹ x + Σθ⁻¹ μθ).    (18)

However, for nonlinear models such as sums of Gaussians, the solution is practically obtained by using nonlinear LS algorithms. For this, one first defines the stacked residual vector:

y(θ) ≜ [Σe^{−1/2} (x − x̃(θ)); Σθ^{−1/2} (θ − μθ)].    (19)

So, (17) is equivalent to θ̂_BYS = arg min_θ {y(θ)ᵀ y(θ)}, which can be solved by nonlinear LS algorithms such as trust-region-reflective [25] or Levenberg-Marquardt [26]. In the sequel, all nonlinear schemes (LS, ML and Bayesian) have been implemented by the trust-region-reflective algorithm.
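For the linear-Gaussian case, the closed form (18) can be sketched directly; the limiting behaviors (near-noiseless data recovers the true parameters, overwhelming noise recovers the prior mean) are useful sanity checks. All numerical values below are illustrative.

```python
import numpy as np

def bayes_linear(x, H, Sigma_e, mu_th, Sigma_th):
    """Closed-form MAP/MMSE estimate for x = H @ theta + e with
    theta ~ N(mu_th, Sigma_th): the generalized-Tikhonov solution (18)."""
    A = H.T @ np.linalg.solve(Sigma_e, H) + np.linalg.inv(Sigma_th)
    b = H.T @ np.linalg.solve(Sigma_e, x) + np.linalg.solve(Sigma_th, mu_th)
    return np.linalg.solve(A, b)

t = np.linspace(-1.0, 1.0, 50)
H = np.vander(t, 2, increasing=True)     # toy line model: (DC, slope)
theta_true = np.array([0.5, 1.0])
mu_th, Sigma_th = np.zeros(2), np.eye(2)

# Nearly noiseless data pulls the estimate towards the true parameters...
th1 = bayes_linear(H @ theta_true, H, 1e-8 * np.eye(50), mu_th, Sigma_th)
# ...while overwhelming noise pulls it back towards the prior mean.
th2 = bayes_linear(H @ theta_true, H, 1e8 * np.eye(50), mu_th, Sigma_th)
```

The same trade-off between the data term and the prior term is what (19) encodes for the nonlinear case.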

III. CRLB FOR POPULAR ECG MODELS
In this section, we derive the ML and Bayesian CRLBs for linear-in-parameter and sum-of-Gaussian expansions, which are the most popular models for the ECG (or ECG segments).

A. Linear-in-parameter functional expansions
Linear-in-parameter models can be formulated as

x(nTs) = h(nTs)ᵀ θ + e(nTs),    (20)

where h(nTs) = [h1(nTs), h2(nTs), ..., hP(nTs)]ᵀ (n = n0, ..., n0+N−1) is the vector of elements (spanning set) and θ = [θ1, θ2, ..., θP]ᵀ denotes the vector of parameters. We are specifically interested in families of h(nTs) which form a basis for the sampled ECG signal space. This guarantees that "any" ECG has an expansion in the form of (20) and there exists a parameter vector θ which satisfies this equation. Since the ECG is a finite-energy signal (both in continuous and sampled spaces), arbitrary basis expansions for the Hilbert space can be used in this context. Using (8) and (15), the Fisher Information for linear-in-parameter models is obtained:

I_ML = Hᵀ Σe⁻¹ H,  H ≜ [h(n0Ts), ..., h((n0+N−1)Ts)]ᵀ ∈ R^{N×P}.    (21)

Specifically, if the h(·) form an orthonormal basis, they satisfy

Hᵀ H = I_P,    (22)

so that for white noise with variance σ², the Fisher Information reduces to I_ML = I_P/σ², i.e., the CRLB of each parameter is σ².

B. Sum of Gaussian functions
Sum of Gaussian (SoG) functions are very common in ECG modeling [11]. Due to their morphological similarity with the ECG complexes, the ECG can be accurately modeled with a few Gaussians. Although SoG functions are neither orthogonal nor linear-in-all-parameters, they form a frame for energy signals [29]. Moreover, due to the similarity of the ECG complexes with Gaussian functions, the parameters of the fitted Gaussian functions can be related to clinical parameters of the ECG by simple transforms [13]. The SoG model can be used to synthesize a beat or segment of an ECG as follows:

x(nTs) = Σ_{i=1}^{M} g_i(nTs) + e(nTs),  g_i(t) ≜ a_i exp(−(t − c_i)²/(2b_i²)),    (23)

where θ = [a_1, b_1, c_1, ..., a_M, b_M, c_M]ᵀ is the parameter vector, g_i(nTs) denotes the Gaussian functions, and, in accordance with our original data-model (1), M (= P/3) is the number of Gaussian functions, the i-th having peak amplitude a_i, effective width b_i, and center c_i. The model (23) is linear in the Gaussian amplitudes and nonlinear in the center and width parameters. The Fisher Information matrix I_ML(θ) ∈ R^{3M×3M} consists of 3×3 sub-matrices corresponding to the pairs of Gaussians:

I_ML(θ) = [I_ij]_{i,j=1,...,M},  I_ij ∈ R^{3×3}.    (24)

Using (15) and defining Δ_i(n) ≜ (nTs − c_i), for white Gaussian noise of variance σ² the elements of the Fisher Information matrix are found as follows:

[I_ij]_{a,a} = (1/σ²) Σ_n g_i(nTs) g_j(nTs)/(a_i a_j),
[I_ij]_{a,b} = (1/σ²) Σ_n g_i(nTs) g_j(nTs) Δ_j(n)²/(a_i b_j³),
[I_ij]_{a,c} = (1/σ²) Σ_n g_i(nTs) g_j(nTs) Δ_j(n)/(a_i b_j²),
[I_ij]_{b,b} = (1/σ²) Σ_n g_i(nTs) g_j(nTs) Δ_i(n)² Δ_j(n)²/(b_i³ b_j³),
[I_ij]_{b,c} = (1/σ²) Σ_n g_i(nTs) g_j(nTs) Δ_i(n)² Δ_j(n)/(b_i³ b_j²),
[I_ij]_{c,c} = (1/σ²) Σ_n g_i(nTs) g_j(nTs) Δ_i(n) Δ_j(n)/(b_i² b_j²).    (25)

1) Full-wave segregated Gaussian fitting: In deriving (25) we have made no assumptions on the number or centers of the Gaussian functions. Considering that the SoG has been extensively used for full-wave ECG morphological modeling [10], [11], [30], the above equations can be simplified to obtain more tangible rules of thumb regarding the Fisher Information of well-distanced Gaussian functions (e.g., corresponding to the P, QRS, and T-waves of the ECG). Specifically, if (i) the Gaussians are well-distanced as compared with their effective widths, i.e., |c_i − c_j| ≫ max(b_i, b_j) for i ≠ j; (ii) the signal is significantly over-sampled, i.e., fs/fmax ≫ 2, where fs and fmax are respectively the sampling and the maximum frequencies of the signal (permitting the summations to be approximated by their corresponding Riemann integrals); and (iii) the SoI covers the effective segments of all the Gaussian functions (so that the summation of the Gaussian residuals outside the SoI can be neglected), then for i ≠ j, [I_ij] ≈ 0, and for i = j, the following approximations hold:

[I_ii]_{a,a} ≈ √π b_i/(σ² Ts),    (26)
[I_ii]_{b,b} ≈ 3√π a_i²/(4 σ² b_i Ts),    (27)
[I_ii]_{c,c} ≈ √π a_i²/(2 σ² b_i Ts),    (28)
[I_ii]_{a,b} ≈ √π a_i/(2 σ² Ts),  [I_ii]_{a,c} ≈ [I_ii]_{b,c} ≈ 0.    (29)

2) Half-wave Gaussian fitting: Half-wave Gaussians have been shown to be useful for modeling asymmetric ECG waves such as the P- and T-waves [13], [31]. Suppose that a Gaussian function is used to model a half-wave ECG and, as in Section III-B1, the signal is significantly over-sampled and the other Gaussians are far enough away. Then the Gaussians significantly dwindle outside the modeled SoI, and the Riemann integrals that approximate the summations in (25) are taken over a half-line, which halves the corresponding entries in (26)-(28) (while the amplitude-center and width-center cross-terms no longer vanish). It can be seen from (26)-(29) that as long as the SoI covers the entire modeled wave (the requirement for approximating (25) with (26)-(29)), the centers of the Gaussian functions do not appear in the CRLB bounds. The practical implications of the Fisher Information and CRLBs are discussed later in the results and discussions.
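The SoG Fisher Information in (25) follows from the model's Jacobian. A numerical sketch under white Gaussian noise (I_ML = JᵀJ/σ²) is given below; the two well-separated toy Gaussians illustrate the block-diagonal structure assumed for well-distanced waves in Section III-B1.

```python
import numpy as np

def sog_fisher(t, params, sigma2):
    """Fisher Information of the sum-of-Gaussians model under white Gaussian
    noise: I_ML = J^T J / sigma2, with the analytic Jacobian w.r.t. each
    (a_i, b_i, c_i) triple in `params` (flat array [a1, b1, c1, a2, ...])."""
    cols = []
    for a, b, c in np.asarray(params, dtype=float).reshape(-1, 3):
        d = t - c
        g = a * np.exp(-d**2 / (2.0 * b**2))
        cols += [g / a,                # d x / d a_i
                 g * d**2 / b**3,      # d x / d b_i
                 g * d / b**2]         # d x / d c_i
    J = np.stack(cols, axis=1)
    return J.T @ J / sigma2

# Two well-separated toy Gaussians: the cross-blocks of I_ML are negligible.
t = np.arange(0.0, 1.0, 1.0 / 1000.0)
I = sog_fisher(t, [1.0, 0.02, 0.2, 0.5, 0.03, 0.7], sigma2=0.01)
```

When the centers are far apart relative to the widths, the off-diagonal 3×3 blocks vanish and each wave's parameters can be bounded independently, as in the full-wave approximations above.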

IV. PRACTICAL EXTENSIONS OF THE FRAMEWORK

A. Performance bounds on clinical parameter estimation
The CRLB of the mathematical model parameters can be transformed into CRLBs on the clinical parameters. Following the background in Section II-A, if γ = T(θ) denotes the mapping between the model and clinical parameters, the ML CRLB of γ is obtained as follows [14]:

C(γ) = ∇_θ T(θ)ᵀ I_ML(θ)⁻¹ ∇_θ T(θ).    (30)

The same equation holds for the CRLB in the Bayesian framework. While γ = T(θ) might not generally be available analytically (due to a non-closed-form solution for (3)), the CRLB of the model parameters vs. the clinical parameters can be numerically calculated through numeric approximations of the gradient in (30), or by using sequential Monte Carlo techniques.
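When T(θ) is only available numerically (e.g., as the root of (3)), the gradient in (30) can be approximated by central differences. The sketch below propagates a parameter covariance bound to a scalar clinical parameter; the linear toy transform and the diagonal covariance are illustrative only.

```python
import numpy as np

def crlb_transform(T, theta, C_theta, eps=1e-6):
    """Delta-method propagation of a CRLB, as in (30):
    C(gamma) ~= grad T(theta)^T C_theta grad T(theta),
    with a central-difference approximation of the gradient."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        grad[i] = (T(theta + d) - T(theta - d)) / (2.0 * eps)
    return grad @ C_theta @ grad

# Toy transform: an onset s = c - 3*b of a Gaussian with parameters (a, b, c).
C_theta = np.diag([1e-4, 4e-6, 9e-6])    # illustrative parameter CRLBs
bound = crlb_transform(lambda th: th[2] - 3.0 * th[1], [1.0, 0.04, 0.2], C_theta)
```

For a linear transform the central difference is exact, so the result matches the analytic gradient propagation.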

B. CRLB in multi-beat estimation
ECG parameter estimation from more than one beat is a common technique to obtain more accurate parameter estimates. Many ECG analysis software packages use an average, weighted average, or median of multiple beats to form a 'template beat' used for estimating parameters such as the QRS width, QT-interval, etc. [32]. The R-peak is commonly used as the reference point for aligning the multiple beats. The basic assumption is that the ECG morphology is rather consistent and that the ECG deviations and noise are uncorrelated across the beats. Under these conditions, it has been shown that the SNR of the template beat is improved by a factor of K by synchronous averaging over K beats [33]. Regardless of the estimation algorithm (averaging, weighted averaging, median, etc.), we show how using multiple beats generally improves the CRLB.
Extending the data-model in (2) to multiple beats, we denote each ECG beat (or segment of interest) of length N as

x^(k) = x̃(θ) + e^(k), (k = 1, ..., K),    (31)

where e^(k) is the beat/segment noise, and all K beats/segments share the same signal model x̃(θ) parameterized by the parameter vector θ (deterministic or stochastic). Next, let X ≜ {x^(1), ..., x^(K)} be the set of all K beats. If the noises of the ECG beats are stochastically independent, we have

f(X; θ) = ∏_{k=1}^{K} f_{e_k}(x^(k) − x̃(θ)),    (32)

where f_{e_k}(·) denotes the pdf of the noise vector in beat/segment k. Therefore, the Fisher Information of the total observation set X equates

I_ML(θ) = Σ_{k=1}^{K} I^(k)_ML(θ),    (33)

where I^(k)_ML(θ) denotes the ML Fisher Information of beat/segment k. Specifically, if e^(k) ∼ N(0, Σ_{e_k}), we obtain

I_ML(θ) = Σ_{k=1}^{K} (∂x̃(θ)/∂θ)ᵀ Σ_{e_k}⁻¹ (∂x̃(θ)/∂θ),    (34)

which is a multi-beat generalization of (15). The same result extends to the Bayesian Fisher Information. It is apparent that involving more observations (beats) can improve the performance bound of estimating the underlying parameter of interest. As a special case, if the covariance matrices of the additive noise are the same for all K beats/segments, the Fisher Information in the multi-beat experiment is exactly K times that of the single-beat case.
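The additivity of the per-beat Fisher Information terms is easy to verify numerically for a shared linear model; the beat model and noise covariances below are toy values.

```python
import numpy as np

def multibeat_fisher(H, Sigma_list):
    """Total Fisher Information over K independent beats sharing the linear
    model x_k = H @ theta + e_k, e_k ~ N(0, Sigma_list[k]): the per-beat
    Fisher Information terms simply add up."""
    return sum(H.T @ np.linalg.solve(S, H) for S in Sigma_list)

t = np.linspace(-1.0, 1.0, 120)
H = np.vander(t, 3, increasing=True)   # toy polynomial beat model
Sigma = 0.1 * np.eye(120)

I_one = multibeat_fisher(H, [Sigma])
I_five = multibeat_fisher(H, [Sigma] * 5)   # identical noise: exactly 5x
```

With identical noise covariances, K beats multiply the Fisher Information by K, so every per-parameter CRLB shrinks by the same factor.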

C. CRLB for multichannel ECG
The heart is a spatially distributed signal source. Therefore, multi-lead ECG (commonly between six and twelve leads) is very common in clinical trials, and each ECG channel is best suited for measuring a specific cardiac activity. A state-of-the-art question is "How many and which ECG channels are most informative for extracting clinical ECG parameters?" [34]. As shown in Fig. 1, while there is a lot of shared information between different ECG channels, there are also significant differences between the channel-wise waveforms.
To set up a model-based multichannel ECG parameter extraction framework, one can model each channel independently and fuse the model parameters of the channels into the overall clinical parameters in a second step. To extend the data-model in (2) to L-channel recordings, we denote:

x_l = x̃_l(θ_l) + e_l, (l = 1, ..., L),    (35)

where x̃_l(θ_l) is the signal model in channel l, parameterized by the parameter vector θ_l ∈ R^{P_l} with P_l parameters per channel, and e_l is the noise vector in channel l (notice the notation difference with the multi-beat model in (31)). Next, extending (3) to the multichannel case, the clinical parameter of interest is related to the channel-wise parameters in vector form:

Φ(θ_1, ..., θ_L, γ*) = 0,    (36)

where Φ(·) relates the multi-lead parameters to the clinical parameter of interest.
Note that in real-world multi-channel ECG, the per-channel model parameters θ_l and the ECG noise in different channels are rarely independent across the channels. Therefore, the total multichannel Fisher Information is not additive as in the single-channel multi-beat case derived in (33), and the derivation of the CRLB requires per-model customization.

V. CASE STUDIES

A. Robust T/R amplitude ratio calculation
The T- to R-wave amplitude ratio (known as the T/R amplitude ratio) is an important clinical index in ECG-based studies [35]. The calculation of this ratio from noisy data can be very challenging. As our first case study, we show how a model-based approach to T/R estimation can significantly improve the accuracy. A naive approach for calculating the T/R ratio is to detect the R- and T-wave peaks (locally visible from the ECG) and to calculate their ratio. A more robust approach is to fit a parametric model over the ECG segments corresponding to the QRS and the T-waves and to calculate the T/R ratio from the fitted model. For this purpose we adopt a simple polynomial model of order P, using (22). In order to calculate the T/R ratio, we fit two separate models over the QRS and T-waves, and the model coefficients are obtained from (16), using the LS scheme (or an ML scheme with a Gaussian noise assumption). In Fig. 2, a typical noisy ECG and the models fitted over the QRS and T segments are shown. The model order was empirically set to P = 6 for this example, and the QRS and T-wave models were fitted over windows of lengths 50 ms and 250 ms around the QRS and T-wave peaks, respectively. For a statistical study, 627 ECG beats recorded from different channels were randomly selected from the PTB Diagnostic ECG Database [36]. White Gaussian noise (WGN) with a signal-to-noise ratio (SNR) of 10 dB was added to each beat, and the T/R amplitude ratios were calculated from the original and noisy beats, with and without polynomial model fitting. The polynomial model order was P = 6 in all cases.
In Fig. 3, the scatter plots of the T/R ratios are compared between the original and noisy beats, with and without model fitting. It can be seen that the results with polynomial model fitting are much better aligned along the identity line. Statistically, the Spearman's correlation (ρ) increased from 0.567 to 0.966 after using the model-based approach, which indicates that the model has made the calculation more robust to noise. The Spearman's correlation was also calculated for a sweep of SNRs ranging from 0 dB to 30 dB in 2 dB steps, as shown in Fig. 3(d), which demonstrates how the method performs at different SNRs. The impact of the model-based scheme is more significant at lower SNRs.
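A minimal sketch of the model-based T/R computation described above: a polynomial is LS-fitted over short windows around each wave, and the peaks are read off the smooth fits instead of the noisy samples. The toy beat (two Gaussian waves with a true T/R of 0.3), the window placement, and the noise level are all illustrative, not the PTB setup.

```python
import numpy as np

def peak_from_polyfit(t, x, order=6):
    """LS-fit a polynomial of the given order over a window and return the
    peak amplitude of the smooth fit (rather than of the noisy samples)."""
    coef = np.polynomial.polynomial.polyfit(t, x, order)
    tf = np.linspace(t[0], t[-1], 10 * t.size)
    return np.polynomial.polynomial.polyval(tf, coef).max()

# Toy beat: Gaussian R- and T-waves, true T/R = 0.3, white noise added.
rng = np.random.default_rng(2)
fs = 500.0
t = np.arange(0.0, 0.6, 1.0 / fs)
beat = np.exp(-((t - 0.1) / 0.01) ** 2) + 0.3 * np.exp(-((t - 0.4) / 0.05) ** 2)
noisy = beat + 0.02 * rng.standard_normal(t.size)

r_win = (t > 0.075) & (t < 0.125)      # ~50 ms window around the R-peak
t_win = (t > 0.275) & (t < 0.525)      # ~250 ms window around the T-peak
tr_ratio = (peak_from_polyfit(t[t_win], noisy[t_win])
            / peak_from_polyfit(t[r_win], noisy[r_win]))
```

The polynomial acts as a local smoother, which is why the model-based ratio is far more stable at low SNR than reading raw sample maxima.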

B. ST-segment parameter estimation using polynomials
The ST-segment and its morphologic properties contain significant clinical information regarding heart defects such as myocardial ischaemia or infarction. The ST-segment is a relatively smooth (and close to flat) segment of the ECG, enabling its modeling with low-order polynomials. Specifically, the coefficients of the zero-, first- and second-order polynomials have been associated with the level, slope and scooping of the ST-segment, respectively, which are clinically informative [37].
The ST-level is commonly obtained by selecting a specific point of the ST-segment [38], or, for better robustness, an average of a few points or a sub-segment of the ST-segment [39]-[41]. Mathematically, these correspond to estimating the DC level (average) of the ST-segment. To find the appropriate sample, or range of samples for averaging, the estimator needs some fiducial points as reference, which are commonly obtained relative to landmarks such as the R-peak, the J-point (end of S-wave), etc. Therefore, these methods use the same data-model, in which a zero-order polynomial (a horizontal line) is fit to the ST-segment. More advanced methods use non-zero-slope line fits or higher-order polynomials, which provide additional clinical parameters such as the ST-slope and the ST-index [38], [42].
1) Theoretic CRLB: For this experiment, the ST-segment is modeled by a line. If the center of the SoI is chosen as the reference time t = 0, the ST-level and ST-slope can be considered as the line's DC (θ0) and slope (θ1) parameters. The CRLBs for these parameters are calculated using (21) and (22). For classification applications, previous research has defined the ST-index i_ST = θ0 + βθ1, which is a linear regression between the two parameters with the slope parameter β (found by model fitting over large datasets), and has shown that it can be used as a useful clinical parameter [38], [42]. Following (30), the CRLB for the ST-index is

C(i_ST) = [1, β] C(θ) [1, β]ᵀ.    (37)

Since the SoI was assumed to be centered at t = 0, the first-order polynomial basis functions are orthogonal, and therefore C(θ) is diagonal. Therefore, the CRLBs of θ0 and θ1 can be studied independently and the CRLB of i_ST simplifies to C(i_ST) = C(θ0) + β²C(θ1).
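For the centered first-order model under white noise, these CRLBs have a simple closed form (C(θ0) = σ²/N and C(θ1) = σ²/Σt²), which the sketch below computes numerically; the segment length and noise variance are illustrative.

```python
import numpy as np

def st_crlbs(N, fs, sigma2, beta=0.1):
    """CRLBs of the ST-level, ST-slope and i_ST = theta0 + beta*theta1 for a
    line fitted to an N-sample SoI centered at t = 0, under white noise of
    variance sigma2. Centering makes the two basis functions orthogonal."""
    t = (np.arange(N) - (N - 1) / 2.0) / fs      # centered sampling grid
    H = np.stack([np.ones(N), t], axis=1)
    C = sigma2 * np.linalg.inv(H.T @ H)          # CRLB covariance (H^T H / sigma2)^{-1}
    c_level, c_slope = np.diag(C)
    return c_level, c_slope, c_level + beta**2 * c_slope

# 60 ms SoI at fs = 250 Hz -> N = 15 samples; illustrative noise variance.
c_level, c_slope, c_ist = st_crlbs(N=15, fs=250.0, sigma2=0.01)
```

Because the centered basis diagonalizes C(θ), the level and slope bounds decouple, and the ST-index bound is their β-weighted sum, as stated above.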
2) Real dataset: The PhysioNet Long-Term ST Database was used for the evaluation of the proposed experiment [43]. The dataset consists of 86 ECG records from 80 subjects with a variety of ST-segment variations. Each record is between 21 and 24 hours long and contains two to three ECG signals, sampled at 250 Hz. One hundred thousand beats were randomly selected from lead II signals (as the more conventional lead in ST-segment analysis) from all subjects. The signal baselines were removed for further processing.
3) Synthetic dataset: For quantitative evaluations, in addition to the real dataset, the McSharry-Clifford sum-of-Gaussians (SoG) ECG model was used for synthetic ECG generation [11]. The model parameters were trained over the PTB Diagnostic ECG Database [44], using tools from the open-source electrophysiological toolbox (OSET) [45]. Accordingly, the ECG beats of lead II were segmented, mapped to the cardiac phase domain through R-peak alignment [46], and the average beat was calculated from the aligned beats. The R-peaks were considered as the reference time, and the initial values of the SoG parameters were estimated by fitting the model on the average beat by nonlinear least squares. Twenty-six subjects of the PTB dataset were chosen to generate a diverse collection of synthetic ECG types. The number of Gaussians used for ECG modeling varied for each subject and was chosen to achieve a normalized mean square error below 0.02%.
Next, each of the 26 fitted models was used to generate synthetic two-minute ECGs, such that the center, amplitude, and width parameters of the Gaussians used for modeling each beat were chosen by random deviations around the initial fitted values per beat. The distribution of the deviated parameters was chosen to be normal, with mean equal to the initial values (obtained from the model fitting process) and a 3% relative standard deviation (RSTD). This approach keeps the R-peak centers unchanged (since the R-peaks are assumed to be at t = 0 of the SoI), while the centers of the Gaussians used for the P- and T-waves undergo more variations, resulting in realistic ECGs. The RR-intervals of the synthetic ECGs were also normally distributed, with the same mean value as the corresponding PTB data used for training the model and an RSTD of 4%. Fig. 4 shows a 10 s sample of a synthetic ECG with an SNR of 22 dB, generated by this method.

4) Simulation Scenario:
After removing the baseline, the ECGs (real or synthetic) were segmented into beats, and for each beat the SoI was chosen as a 60 ms segment starting from the J-point, as illustrated in Fig. 5. For the real data, the J-points were obtained from the PhysioNet annotation files, and for the synthetic ECG the J-point was fixed to 40 ms after the R-peak. Next, the linear model coefficients were estimated by an LS fit over the clean ST-segments (before adding noise at different SNR levels), using (16). The same procedure was repeated for all the beats, to obtain the prior distribution of the linear model parameters from the clean signals.
Next, the signals were contaminated by additive noise at different SNRs from -20 dB to 40 dB in 5 dB steps. The LS model fitting procedure was repeated for each noisy beat, to obtain the ST-segment DC and slope parameters. As a clinical parameter of interest, we also calculated i_ST numerically and compared it with the theoretical CRLB in (37), for β = 0.1, as proposed in [38].
The ML and Bayesian estimates were obtained from (16) and (18). Next, the covariance of the error between the parameters obtained from the noiseless and noisy records was calculated. This procedure was repeated three times for each ECG beat, each time with a different ensemble of the additive noise.

5) Results:
The corresponding theoretical CRLB performance bounds of the ML and Bayesian frameworks were separately obtained from the derivations in Section III, for comparison. Fig. 6 demonstrates and compares the CRLBs vs. the experimental variances of error, for the ST-level, ST-slope and ST-index, for synthetic and real data, over different SNRs. Accordingly, since the model is linear and the estimator is efficient, the ML estimator reaches the CRLB in all cases. However, for the real data parameters at low SNR, there is a gap between the experimental variance of error and the CRLB of the Bayesian scheme, which is due to the inaccurate prior distribution, as discussed later in Section VI.

C. QT-interval parameter estimation using Gaussian functions
The QT-interval is one of the most informative features of the ECG for coronary artery disease identification [47]. It is defined as the interval between the Q-wave onset and the T-wave offset, as shown in Fig. 7. If the ECG waves are modeled by Gaussian functions, the onset and offset of the Q- and T-waves can be directly inferred by a transform of the model parameters [13], [48], [49].

1) Theoretical CRLB:
Suppose that θ_Q = (a_Q, b_Q, c_Q)^T and θ_T = (a_T, b_T, c_T)^T are the parameter vectors of the amplitudes, widths and centers of the Gaussians by which the Q- and T-waves are modeled. The "effective width" of the Q- and T-waves is proportional to the Gaussian model widths, i.e., βb_Q and βb_T, where β is a constant factor [13]. In the sequel, we follow [13] by selecting β = 3, which places the onsets/offsets at the points where the Gaussian amplitude has fallen by 98.9% from its maximum (i.e., to about 1.1% of the peak).
As shown in Fig. 7, the Q-wave onset (s_Q) and the T-wave offset (e_T) can be expressed as linear transforms of the Gaussian model parameters: s_Q = c_Q − βb_Q and e_T = c_T + βb_T. The CRLBs of these parameters can therefore be calculated from (30) by propagating the parameter CRLB matrix through the corresponding Jacobians:

var(s_Q) ≥ β² CRLB(b_Q) + CRLB(c_Q) − 2β CRLB(b_Q, c_Q),
var(e_T) ≥ β² CRLB(b_T) + CRLB(c_T) + 2β CRLB(b_T, c_T),

where CRLB(·, ·) denotes the corresponding off-diagonal entry of the CRLB matrix. Moreover, since by definition the QT-interval is d_QT = e_T − s_Q, the CRLB of the QT-interval follows as

var(d_QT) ≥ var(e_T) + var(s_Q),

where we have assumed that the parameters of the Gaussians fit over the Q- and T-waves are independent (a reasonable assumption due to their time gap).
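The linear-transform propagation of the Gaussian-parameter bounds to the onset/offset points can be sketched numerically. The 3x3 CRLB matrix below is a hypothetical placeholder; in the paper it would come from inverting the Fisher Information in (30).

```python
# Propagate a (hypothetical) CRLB matrix of theta_Q = (a_Q, b_Q, c_Q)^T
# to the Q-wave onset s_Q = c_Q - beta * b_Q via var(s_Q) = J C J^T.
beta = 3.0

C_Q = [
    [1e-4, 0.0,  0.0 ],   # illustrative values only
    [0.0,  4e-6, 1e-6],
    [0.0,  1e-6, 9e-6],
]
J = [0.0, -beta, 1.0]     # Jacobian of s_Q w.r.t. (a_Q, b_Q, c_Q)

var_sQ = sum(J[i] * C_Q[i][j] * J[j] for i in range(3) for j in range(3))

# For e_T = c_T + beta * b_T the Jacobian is (0, +beta, 1); assuming the
# Q- and T-wave parameters are independent, the QT-interval bound adds:
var_eT = 1.2e-5           # suppose this was obtained the same way for the T-wave
crlb_dQT = var_sQ + var_eT
print(var_sQ, crlb_dQT)
```

Note that the cross-term enters with opposite signs for the onset and offset because of the −β and +β entries in the two Jacobians.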
2) Real dataset: The PhysioNet QT Database is used for this case study [50]. The dataset consists of over 100 fifteen-minute two-lead ECG recordings, gathered from different databases with a wide variety of ECG morphologies and sampled at 250 Hz. After baseline wander removal, 40,000 beats were randomly selected from all subjects for our study.
3) Synthetic dataset: The procedure for generating the synthetic ECG for QT-interval parameter estimation is identical to the procedure detailed in Section V-B3.

4) Simulation Scenario:
The ECG (real and synthetic) was first segmented into beats using the R-peaks as landmarks. The SoI for the Q-wave was set to the interval [r − 80 ms, r − 20 ms], where r is the R-peak position in time. For the T-wave, the SoI was set to [T_max − 20 ms, r + 0.6RR], where RR is the beat-wise RR-interval and T_max = arg max_t x(t), t ∈ [r + 80 ms, r + 0.6RR]. These ranges were adopted from related research on ECG beat segmentation [51]-[53].
The next step was to estimate the parameters from the clean ensembles. Hence, the SoG model was fitted to each SoI by applying the nonlinear LS in (4). The estimated parameters were used to calculate the prior distributions of the model parameters. The clean ensembles were then contaminated by white Gaussian noise at SNRs between -20 dB and 40 dB, in 5 dB steps. Having both the prior and posterior distributions, the theoretical Bayesian and ML CRLBs were calculated from (6).
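For intuition on the kind of computation behind the theoretical bounds of the Gaussian model, the sketch below assembles the ML Fisher Information of a single Gaussian in white Gaussian noise directly from the model gradient. All parameter values are hypothetical, and the paper's actual SoG bounds follow its own derivations.

```python
import math

def gaussian_fisher_info(a, b, c, t, sigma2):
    """ML Fisher Information of g(t) = a * exp(-(t - c)^2 / (2 b^2)) in
    white Gaussian noise with variance sigma2: I = sum_n grad grad^T / sigma2."""
    I = [[0.0] * 3 for _ in range(3)]
    for tn in t:
        e = math.exp(-((tn - c) ** 2) / (2 * b ** 2))
        grad = [
            e,                                 # d g / d a
            a * e * (tn - c) ** 2 / b ** 3,    # d g / d b
            a * e * (tn - c) / b ** 2,         # d g / d c
        ]
        for i in range(3):
            for j in range(3):
                I[i][j] += grad[i] * grad[j] / sigma2
    return I

# Illustrative T-wave-like Gaussian: a = 0.3 mV, b = 40 ms, centered in the SoI
fs = 250.0
t = [n / fs for n in range(-25, 26)]   # ~200 ms window around the center
I = gaussian_fisher_info(0.3, 0.04, 0.0, t, sigma2=1e-4)
print(I[0][0], I[1][1], I[2][2])
```

Inverting this matrix gives the ML CRLB; the Bayesian bound would add the prior-information term before inversion. For a window symmetric about the center, the amplitude-center cross-entry vanishes because its summand is odd in (t − c).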
The noise-contaminated signals were given to the nonlinear MAP and ML estimators detailed in Section II-C to estimate the SoG model parameters. The simulation was repeated three times per beat and per SNR, with different random additive noise, to achieve more robust numerical results. Finally, the variances of error were compared with the theoretical CRLBs.

5) Results:
Fig. 8 shows the theoretical CRLBs compared with the variances of error for the Gaussian model fit on the Q- and T-waves, for both the real and synthetic datasets, over different SNRs. For the ML scheme, the Fisher Information I_ML(θ) is a function of the parameters; therefore, to make a visual comparison of the different methods feasible, E_θ{I_ML(θ)} is plotted in the results. Also, since the CRLB and the covariance of error matrices are not necessarily diagonal for the sum of Gaussians model, we compare their determinants. From Fig. 8, we can see that since the data-model is nonlinear in its parameters and the estimators are not efficient, the covariance of error does not reach the CRLB. Nonetheless, the theoretical CRLB still gives an idea about the quality of the estimator and how close it is to the lower bound of error (per SNR). The results are further discussed in Section VI.
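The determinant-based comparison of full covariance matrices mentioned above can be sketched as follows; both matrices are hypothetical placeholders. For an unbiased estimator, Cov − CRLB is positive semidefinite, which implies det(Cov) ≥ det(CRLB).

```python
def det3(M):
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Illustrative CRLB and empirical error-covariance matrices for one
# Gaussian's (a, b, c); values are made up for the sketch.
CRLB = [[2e-4, 0.0, 0.0], [0.0, 5e-6, 1e-6], [0.0, 1e-6, 8e-6]]
Cov  = [[3e-4, 0.0, 0.0], [0.0, 7e-6, 1e-6], [0.0, 1e-6, 9e-6]]
print(det3(Cov) >= det3(CRLB))  # True for this example
```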

VI. DISCUSSION
The CRLB depends on the data-model, and data-models are subjective and a matter of debate. Therefore, the CRLB by itself does not make any judgment about the validity of a model. However, various measures exist for model assessment, including the MSE between real data and the model, the Akaike information criterion (AIC), the minimum description length (MDL), and goodness-of-fit analysis on the estimation results (to assess the prior and posterior assumptions). Combined with such model-assessment tools, the hereby derived CRLBs become a practically important means of evaluating any human- or machine-based ECG annotation system.
In the context of estimation theory, the CRLB is one of many lower bounds on the estimation error and does not necessarily guide us towards an efficient estimator. However, when the estimator is efficient, the CRLB is indeed a gauge of the estimation accuracy. Moreover, regardless of the estimator's efficiency, the CRLB provides helpful information regarding the appropriateness of the estimation scheme. In the sequel, some of the important results that can be derived for the ECG models studied in Section III, as the most popular ECG models in the literature, are reviewed.

1) Multi-beat parameter estimation:
For regular morphologies, increasing the number of observed segments by synchronous averaging across multiple beats improves the performance bound (by a factor of K), as shown in (33) and (34).

2) SNR effect:
As a parameter estimation problem, it is trivially expected that the performance degrades at lower SNRs. This is confirmed by all the Fisher Information equations in Section III, where σ_e^2 appears in the denominator of the ML Fisher Information. The SNR effect is also visible in the results in Figs. 6 and 8.

3) Sampling frequency, segment length and wave width:
Three related aspects influence the bounds: (A) the length of the modeled segment (SoI), (B) the effective width of the ECG wave of interest, and (C) the sampling frequency of the signal (in Hz). Although the three aspects are related, they do not have identical effects on the CRLB. Among these factors, following (21) and (26)-(29), the Fisher Information (CRLB) is linearly proportional to the sampling frequency (the sampling time). Therefore, as a rule of thumb, over-sampling above the Nyquist frequency generally improves the CRLB regardless of the other factors. The impact of factors (A) and (B) is less explicit. However, looking at the entries (2,2) and (3,3) of the Fisher Information and CRLB matrices of the SoG model in (26)-(29), we notice that the CRLB (Fisher Information) is proportional (inversely proportional) to the effective width of the Gaussian model. Therefore, the rule of thumb is that the estimation error variance of ECG wave centers and widths is smaller (larger) for narrower (wider) waves. As an example, supposing that one has an efficient estimation algorithm that reaches the CRLB, at identical sampling rates and with sufficiently long segments, fitting a single Gaussian function over the QRS is more accurate than fitting a Gaussian over the T-wave. Moreover, since the SoI length does not appear in (26)-(29), we conclude that increasing the length of the SoI beyond the "effective width" of the wave of interest does not improve the CRLB.

4) Model parameter priors:
Comparing the ML and Bayesian frameworks, the term due to the model parameter priors appears as an additive term in the Bayesian Fisher Information in (13).
Since the Fisher Information is a positive definite matrix, we can conclude that accurate priors always decrease the CRLB (i.e., improve the performance bound). The results in Section V also confirm this fact: the Bayesian CRLB is always lower than the ML one. However, the outperformance of Bayesian over ML estimation is only guaranteed for "accurate priors." For example, since there was a considerable number of abnormal beats in the QT database, the assumption of Gaussian-distributed priors for the model parameters was not accurate in this case. Therefore, in Figs. 8(a) and 8(b), it is seen that the Bayesian performance is worse than the ML performance at high SNRs, i.e., the priors have not helped to improve the estimation performance. However, for the synthetic data, where the parameters were generated by a normal distribution, the performances of the Bayesian and ML schemes are asymptotically identical at high SNRs (cf. Figs. 8(d) and 8(e)). A similar effect is seen for the real vs. synthetic ST datasets shown in Fig. 6, where the Bayesian estimator is no longer efficient (does not reach the CRLB) when provided with inaccurate priors.

5) Linear-in-parameter models:
For linear-in-parameter models, efficient estimators are guaranteed to exist [54]. The results of the ST-segment polynomial models in Section V-B also confirm this finding. Another feature of linear-in-parameter models is that their ML CRLB is independent of the parameter values. Therefore, the theoretical CRLB gives a general performance bound (regardless of the parameter values). However, for nonlinear-in-parameter models, the CRLB changes with the model parameters. Therefore, for example, polynomial fits over the baseline and the ST-segment have identical properties and CRLB bounds, while the CRLBs of Gaussian models fitted over the QRS and T-waves depend on both the amplitudes and widths of these waves.
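Some of these rules of thumb can be checked numerically on the amplitude entry of a single-Gaussian Fisher Information: the linear effect of the sampling frequency, the ineffectiveness of widening the SoI beyond the effective wave width, and the tightening effect of an additive prior term. All parameter values below are illustrative assumptions, not the paper's settings.

```python
import math

def amp_fisher_info(b, fs, win, sigma2):
    """ML Fisher Information of the Gaussian amplitude: for
    g(t) = a * exp(-(t - c)^2 / (2 b^2)), the amplitude entry is
    sum_n exp(-(t_n - c)^2 / b^2) / sigma2 over the SoI samples."""
    n_max = int(win * fs)
    return sum(math.exp(-((n / fs) ** 2) / b ** 2) / sigma2
               for n in range(-n_max, n_max + 1))

sigma2 = 1e-4                  # noise variance (illustrative)
b = 0.04                       # T-wave-like Gaussian width, 40 ms

I_fs   = amp_fisher_info(b, 250.0, 0.1, sigma2)   # 200 ms SoI at 250 Hz
I_2fs  = amp_fisher_info(b, 500.0, 0.1, sigma2)   # same SoI, doubled fs
I_wide = amp_fisher_info(b, 250.0, 0.3, sigma2)   # SoI widened to 600 ms

print(I_2fs / I_fs)    # ~2: information scales linearly with fs
print(I_wide / I_fs)   # ~1: widening the SoI beyond the wave adds little

# An additive Gaussian-prior information term (as in the Bayesian Fisher
# Information) can only increase the information, hence tighten the bound:
prior_var = 1e-3
print(1.0 / (I_fs + 1.0 / prior_var) < 1.0 / I_fs)  # True
```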

VII. CONCLUSION
ECG parameter estimation, whether performed by human experts or by machines and algorithms, is always "model-based". Therefore, depending on the adopted model and the estimation scheme (LS, ML, or Bayesian), there are theoretical bounds on the achievable accuracy of any estimation algorithm. An important implication of this perspective is that performance bounds such as the CRLB depend only on the adopted data-model and estimation scheme, and not on the choice of the estimation algorithm itself.
In this research, the CRLB was derived for the most popular data-models used in the ECG literature (polynomial expansions and sums of Gaussians). The framework was validated over real and synthetic data for three practical applications: T/R ratio, ST-segment and QT-interval estimation, using state-of-the-art estimators in each context. The results of this study can guide researchers in the selection of data-models, sampling frequency, modeling segment length, and the number of beats required for average ECG beat extraction.
The work can be extended in various directions. From the modeling perspective, ECG models such as Legendre polynomials and splines can be used [10], [37], which have interesting theoretical and practical properties. The derivation of the CRLB for alternative ECG models and their analytical/numerical comparison with the hereby presented results would be highly insightful.
The assessment of model priors is another direction for future studies. Specifically, the impact of modeling errors (in the data-model, priors or posteriors) is of significant importance. For instance, depending on the parameter of interest and the measurement noise type, distributions such as Rayleigh, Poisson, Gamma, Gaussian mixtures, etc., can be used for specific scenarios. The impact of accurate vs. approximate prior/posterior assumptions on the theoretical bounds of the ML and Bayesian frameworks can be studied in this context.
Finally, from the estimation perspective, the development of estimation algorithms that reach the theoretical CRLBs would be of great practical importance.