Bootstrap Empirical Mode Decomposition with Application to CIELAB Color Images

Sifting-based empirical mode decomposition (EMD) lacks a convergence proof, whereas one-shot convex-optimization-based methods depend on the regularization function. This paper proposes an alternative optimization-based EMD built on two notions: 1. local mean points that impose mode symmetry via a Tikhonov-regularized least-squares (RLS) problem, and 2. efficient bootstrap sifting that guarantees asymptotic convergence of the mean envelope to the local mean points, regardless of regularization. A mathematical proof of convergence and a straightforward extension to the 2D-multivariate setting and CIELAB color images are presented. Performance is demonstrated with a univariate signal and two images. Spectral analysis confirms coordinated feature extraction among image components, and separation of spatial spectra among the intrinsic mode functions.


A. General Background
An important challenge in the analysis of images of the natural environment is the extraction of local features from background clutter. For example, SONAR images and remote-sensing images are cluttered by high spatial-frequency content [1]–[3]. Most of these images are non-stationary: local features are manifested by spatial relations among neighboring pixels or local regions. The usual methods for spectral analysis, being global in space and time, lead to poor performance [4].
One school of thought that mimics localized attention in human perception, called visual saliency [5], [6], is to analyze a pixel's relation with its surroundings in terms of color, orientation, contrast, curvature, etc. The resulting saliency map is a coding of the visual field where maxima of the saliency function indicate local points of attention [1], [7]–[9]. Shaw et al. [10] argued that cluttered background and ill-defined object boundaries could lead to fuzzy saliency maps, and proposed a multiscale saliency detection method based on 2D empirical mode decomposition (EMD) and Hilbert spectral analysis.
Originally, Huang et al. [11] introduced EMD to analyze 1D, univariate non-stationary signals. Its central concept is that a signal is composed of spectrally disjoint components called intrinsic mode functions (IMF) that are symmetric oscillations about zero, with possible amplitude and frequency modulations (AM-FM). The goal of EMD is successive extraction of each IMF from the signal, beginning with the highest spectral content, to leave a monotonic residue that reflects global features or trends. Processing the IMFs by the Hilbert transform (HT) yields an estimate of signal amplitudes (or powers) in the time-frequency domain. The combined procedure of EMD and HT is known as the Hilbert-Huang Transform (HHT).
Nunes et al. [25] developed bidimensional EMD (BEMD) for univariate signals on a 2D support. This opened the door to both the application and further development of EMD and HHT for 2D data, notably monochrome images. The problem of local-feature extraction is thereby treated as the separation of non-stationary images into spatial-frequency-disjoint components with localized amplitudes [26]. Joint space and spatial-frequency representation and AM-FM analysis of images have also received much attention. For example, HHT was applied to texture analysis [27] and image stabilization [28]. In [29], IMFs of ultrasound images were colorized and reconstituted to aid human interpretation. An EMD-like algorithm was proposed in [30] to obtain the skeletons of objects. In [10], BEMD and Hilbert spectral analysis were applied to obtain multiscale saliency maps, from which a final aggregate map was derived. It has been argued that, since the EMD technique is fully data-driven without any pre-defined filter or wavelet functions, it is more adaptive in nature than filtering schemes [10], [25].

B. Mean Envelope and Sifting Process
Algorithm 1 is the basic structure of classical EMD. A key element is the sifting process which, at each level of EMD, iteratively determines the mean trend of the signal and subtracts it from the latter to yield a proto IMF z̃^k, which becomes the subject of the next sifting iteration. Sifting ends when the trend has sufficiently flattened, implying that what remains is the desired IMF. Computation of the mean in each sifting iteration is realized by first determining the local extrema, which are then spline-interpolated to produce the upper and lower envelopes. The mean is the average of these two envelopes.
Algorithm 1: Classical EMD
  Initialize: z^1_in = original signal.
  for l = 1 to M do
    begin level-l EMD:
      begin sifting:
        Initialize: k = 0, z̃^0 = z^l_in, z^l_res = 0;
        while ‖ẑ^k‖ > ε or k ≤ k_max do
          Determine local extrema of z̃^k;
          Compute spline envelopes ẑ^k_max and ẑ^k_min of local maxima and minima;
          Compute mean signal ẑ^k = (1/2)(ẑ^k_max + ẑ^k_min);
          Compute next proto IMF (sift out mean): z̃^{k+1} = z̃^k − ẑ^k;
          Accumulate mean into residue: z^l_res += ẑ^k;
          k = k + 1.
        end
      end
      Level-l output: Residue: z^l_res. IMF: z̃^k.
    end
  end

Further research in BEMD has focused on two closely related problems. Firstly, constructing the upper and lower envelopes by interpolation is empirical and computationally onerous; the choice of spline functions is crucial in characterizing the underlying oscillations of the signal [16]. A common problem is the appearance of artifacts due to overshoots and undershoots. To resolve this problem, Delechelle et al. [31] estimated the mean envelope using parabolic partial differential equations. Damerval et al. [32] introduced a fast algorithm for BEMD based on respective Delaunay triangulations (DT) of the local extrema, and piecewise cubic polynomial interpolation to obtain the min and max envelopes. Unlike the latter, Xu et al. [33] proposed another DT-based approach that directly estimates the mean envelope with piecewise linear basis functions having the triangles as support, followed by bi-cubic interpolation of the latter.
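As a concrete illustration, the sifting loop of classical EMD can be sketched in a few lines of numpy/scipy. This is a minimal sketch under our own simplifying assumptions (cubic-spline envelopes, a three-point extrema test, made-up function names), not the exact implementation used by any of the cited references.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sift_once(z):
    """One sifting iteration: spline envelopes of the extrema, mean removal."""
    t = np.arange(len(z))
    imax = [i for i in range(1, len(z) - 1) if z[i] > z[i - 1] and z[i] > z[i + 1]]
    imin = [i for i in range(1, len(z) - 1) if z[i] < z[i - 1] and z[i] < z[i + 1]]
    if len(imax) < 4 or len(imin) < 4:
        return z, np.zeros_like(z)          # too few extrema: trend reached
    upper = CubicSpline(imax, z[imax])(t)   # upper envelope
    lower = CubicSpline(imin, z[imin])(t)   # lower envelope
    mean = 0.5 * (upper + lower)            # mean envelope
    return z - mean, mean                   # proto IMF, sifted-out mean

def classical_emd_level(z, k_max=20, eps=1e-8):
    """One EMD level: sift until the mean envelope is negligible."""
    proto, residue = z.copy(), np.zeros_like(z)
    for _ in range(k_max):
        proto, mean = sift_once(proto)
        residue += mean                     # accumulate mean into the residue
        if np.linalg.norm(mean) <= eps:
            break
    return proto, residue                   # IMF, residue
```

By construction the proto IMF and the accumulated residue always sum back to the input, which mirrors the level-l bookkeeping of Algorithm 1.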
All the references cited so far are sifting-based. A second, much-studied problem is that sifting has no theoretical convergence guarantee; its outcome depends heavily on the spline functions employed. From this emerged optimization-based approaches that eliminate sifting altogether. Meignen et al. [34] and Oberlin et al. [35] replaced sifting by constrained convex optimization to obtain the mean envelope. Huang et al. [36] also formulated a convex optimization problem, but for the upper and lower envelopes.
Another direction is regularized convex optimization that penalizes asymmetry of the IMF at extremal locations. Thus, Pustelnik et al. [37] formulated a 1D-EMD optimization problem with non-differentiable convex penalty functions and orthogonality constraints, solved with a proximal algorithm. This was extended to 2D EMD by Schmitt et al. [4] as a primal-dual splitting algorithm [38]. Colominas et al. formulated 1D [39] and 2D [40] quadratic penalty functions, leading to least-squares minimization problems with Tikhonov regularization for smoothness solvable in closed-form.
Instead of the mean envelope, Chen et al. [41] proposed a similar optimization-based method to estimate the extremal envelopes, in which regularization is derived from edge-preserving bilateral filtering. Curiously, this work still employs sifting.

C. Multivariate EMD
Extension of EMD to multivariate signals, whether on 1D or 2D support, requires rethinking. The notions of extrema, upper and lower envelopes are less well defined than in the univariate case. On the one hand, various works have decomposed color or otherwise multi-channel images by treating each channel as a separate 2D-univariate signal, with or without recombination of the multiple final results [42]–[45].
On the other hand, true multivariate EMD (MEMD) approaches have emerged based on projection onto direction vectors sampled on the unit hypersphere in the signal's range space. For each direction vector, the projection is a univariate signal whose extrema correspond to the vectorial signal's rotations. In the 1D-multivariate setting, Rilling et al. [46] thus obtained the multivariate envelopes by component-wise interpolation of these extrema. Rehman et al. [47] extended MEMD to 2D by stacking the pixels into a 1D column. Xia et al. [48] proposed a bidimensional MEMD for multi-focus image sets by direct projection and envelope computation on a 2D support. These methods employ sifting.
A non-projection method by Fleureau et al. [49] redefines characteristic points of 1D-multivariate signals as the local minima of the squared 2-norm of signal velocity. The so-called turning tangents correspond to direction changes. The mean signal is obtained by interpolating the barycenters of signal elements between characteristic points. This method also relies on sifting.
It is worth mentioning that the previously cited approach of Oberlin et al. [35] is multivariate. With constraints on the barycenters of extrema, it effectively imposes symmetry in multi-dimension. The extrema are, however, computed component-wise.

D. About this Contribution
This work is motivated by two observations. Firstly, the regularized convex optimization approaches mentioned above are nominally one-shot, in the sense that mode decomposition stops at optimization without sifting, even though a few sifting iterations were still performed for the examples in [39]. Without sifting, the resulting modes depend on the choice of regularization parameters: large values mean that the intended IMF symmetry is not exactly enforced, whereas the convergence of sifting on the optimal solutions remains unproven.
Secondly, despite the development of MEMD, available works on color images treat the color components separately. For example, in [42] the recombination of the IMFs of RGB components produced what appeared to be phantom colors, probably due to the lack of coordination among the components. This paper presents an alternative optimization-based EMD approach based on a sifting-like iteration with a convergence guarantee. The fundamental formulation is given for 1D-univariate signals, which is readily extended to the 2D-multivariate case. Specifically, it is intended for images in the CIELAB color space. Three notions are central to the proposed method: local mean points, regularized least-squares (RLS) optimization, and bootstrap sifting.
Firstly, for 1D-univariate signals, a local mean point is defined as the barycenter, in terms of both location and intensity, between a local minimum (resp. maximum) and its two neighboring maxima (resp. minima). The barycentric coordinates are bilateral weights that take into account both geometric and photometric relations. In the 2D multivariate case, extrema are zero-crossing nodes of all the component velocities. Two DTs are constructed for the positive and negative zero-crossing nodes irrespective of components. Since any point has a unique enclosing triangle in either DT, a local mean point associated with a positive (resp. negative) node is defined as the barycenter between it and its enclosing triangle of negative (resp. positive) nodes.
Secondly, a mean envelope that passes exactly through the local mean points implies IMF symmetry about these points. Thus, we penalize the squared distances of the local mean points to the mean envelope, as well as gradient smoothness of the latter. This yields a Tikhonov-regularized least-squares problem with a closed-form solution for the mean envelope. This symmetry penalty includes [4], [37], [39], [40] as a special case.
Thirdly, we formulate a theoretically convergent and computationally efficient sifting-like iteration on the optimal solution. It does not recalculate the extrema nor the local mean points during iteration. Instead, it "bootstraps" from the local mean points that are calculated once from the input signal and iterates the optimal solution on the fit errors. We prove that the process is a contraction, which means that the mean envelope asymptotically approaches the local mean points despite regularization. The approach is named bootstrap EMD after the nature of this sifting process.
The proposed formulation does not require the signal intensity to belong to an ordered set, thus allowing a formal extension to the multivariate case. However, this extension is meaningful only for signals with a Euclidean range space. Firstly, the barycenter is a weighted arithmetic mean that is defined on a Euclidean space as the element b minimizing the weighted sum of squares Σ_i w_i ‖b − x_i‖² over points x_i with weights w_i, where ‖·‖ denotes the Euclidean distance. Secondly, the bilateral weights and penalty functions are also defined in the Euclidean norm. Thirdly, the RLS problem has a norm-minimal closed-form solution. Now, in the CIELAB color space, the Euclidean distance is a measure of photometric similarity [50]. Therefore, we choose to apply bootstrap EMD to images in the CIELAB color space. Numerical results show that bootstrap EMD can decompose a synthesized, non-stationary univariate signal into its FM, AM, and random-phase-modulated (PM) constituents with an accuracy comparable to classical EMD. In contrast, the one-shot method in [39] does not extract the PM component as such. When applied to color images, the proposed approach produces color-coherent IMFs and residues that separate local features by distinct spatial spectra. In an example where fine local features have different colors than the background, color-spectrum shifts in the residues also demonstrate the effectiveness of bootstrap EMD.

E. Organization
Section II presents bootstrap EMD for 1D-univariate signals and its convergence proof. Section III details an extension to the 2D-multivariate setting, specifically CIELAB color images. Section IV compares the proposed approach with closely related references. Demonstrations of the proposed approach are given in Section V. A conclusion and remarks for further work are given in Section VI.

II. BOOTSTRAP EMD

A. Local Mean Points
We consider a discrete-time signal which takes real values z t at instances t on a finite time interval. For simplicity of presentation, it is assumed that the time instances are equally spaced, thus t ∈ D = {1, . . . , n} without loss of generality. We denote by z ∈ R n the entire sequence.
Suppose that there exist p local extrema at time instances {t_1, . . . , t_p}. Let L ≜ {1, . . . , p}. Since the signal is univariate, these extrema are alternately maximal and minimal. Assume that, for l ∈ L\{1, p}, the l-th extremum is a local minimum (resp. maximum). It is bordered by the (l − 1)-th and (l + 1)-th extrema, which are local maxima (resp. minima). Since they are of opposite types, it is reasonable to define a local mean point (t̄_l, z̄_l) associated with the l-th extremum by the following weighted means of location and intensity:

    t̄_l = w_{l−1} t_{l−1} + w_l t_l + w_{l+1} t_{l+1},    (1)
    z̄_l = w_{l−1} z_{t_{l−1}} + w_l z_{t_l} + w_{l+1} z_{t_{l+1}},    (2)

where (w_{l−1}, w_l, w_{l+1}) are some chosen weights such that w_{l−1} + w_l + w_{l+1} = 1. For the first (resp. last) extremum, i.e., l = 1 (resp. l = p), we apply (1)-(2) with a reflection of the 2nd (resp. (p − 1)-th) extremum about the location of (t_1, z_{t_1}) (resp. (t_p, z_{t_p})):

    (t_0, z_{t_0}) ≜ (2t_1 − t_2, z_{t_2}),    (t_{p+1}, z_{t_{p+1}}) ≜ (2t_p − t_{p−1}, z_{t_{p−1}}).    (3)

Thus, there are p local mean points (t̄_l, z̄_l), l ∈ L. From (1)-(2), it can be seen that (t̄_l, z̄_l) is the barycenter of the three local extrema (t_i, z_{t_i}), i ∈ {l − 1, l, l + 1}, with weights (w_{l−1}, w_l, w_{l+1}). In particular, if these weights equal (1/4, 1/2, 1/4), then (t̄_l, z̄_l) is a point at mid-height of the triangle formed by the local extrema, situated along the median through the vertex (t_l, z_{t_l}). The local mean point is therefore a point of symmetry between the triangle's base and its vertex. An illustration can be found in Fig. 1.
The reason for a more general definition of the weights will become clear in Section III-B, where this notion is applied to image intensity.
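To make the construction concrete, the following numpy sketch computes the local mean points of (1)-(2) with the mid-height weights (1/4, 1/2, 1/4) and a mirror-in-time boundary extension (our reading of the reflection step); the function name and defaults are ours, for illustration only.

```python
import numpy as np

def local_mean_points(z, w=(0.25, 0.5, 0.25)):
    """Local mean points per (1)-(2): barycenter of each extremum and its two
    neighbours of the opposite type, with mirrored boundary extrema."""
    t = np.arange(len(z), dtype=float)
    # interior local extrema: increments change sign (alternating max/min)
    ext = [i for i in range(1, len(z) - 1)
           if (z[i] - z[i - 1]) * (z[i + 1] - z[i]) < 0]
    te, ze = t[ext], z[ext]
    # mirror the 2nd and (p-1)-th extremum about the first and last locations
    te = np.concatenate([[2 * te[0] - te[1]], te, [2 * te[-1] - te[-2]]])
    ze = np.concatenate([[ze[1]], ze, [ze[-2]]])
    w0, w1, w2 = w
    t_bar = w0 * te[:-2] + w1 * te[1:-1] + w2 * te[2:]   # (1)
    z_bar = w0 * ze[:-2] + w1 * ze[1:-1] + w2 * ze[2:]   # (2)
    return t_bar, z_bar
```

For a pure sinusoid the extrema are symmetric about zero, so all local mean points have zero intensity, as the mid-height interpretation predicts.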

B. Mean Envelope via Regularized Least-Squares Minimization
Requiring the mean envelope to pass through the local mean points amounts to imposing symmetry on the IMF. In the following, we construct the mean envelope ẑ by minimizing a penalty function of the distances of the (t̄_l, z̄_l)'s to ẑ.
For each l ∈ L, let i_l ∈ D be the nearest time instant inferior or equal to t̄_l, i.e., i_l ≤ t̄_l < i_l + 1. Define the following error between z̄_l and the linear interpolant of ẑ_{i_l} and ẑ_{i_l+1} at t̄_l:

    ε_l = z̄_l − [(i_l + 1 − t̄_l) ẑ_{i_l} + (t̄_l − i_l) ẑ_{i_l+1}],    (4)

or, introducing notations for the coefficients,

    ε_l = z̄_l − (a_{l,i_l} ẑ_{i_l} + a_{l,i_l+1} ẑ_{i_l+1}),    (5)
    a_{l,i_l} = i_l + 1 − t̄_l,    a_{l,i_l+1} = t̄_l − i_l.    (6)

Collecting all instances of (5) for l ∈ L yields the following linear regression:

    ε = z̄ − Aẑ,    (7)

where ε ∈ R^p, z̄ ∈ R^p and ẑ ∈ R^n are columns formed with the ε_l, z̄_l and ẑ_t. The regressor A ∈ R^{p×n} is a sparse matrix of which the only two nonzero elements in the l-th row are a_{l,i_l} and a_{l,i_l+1}. Moreover, from (5)-(6) it can be seen that each row of A sums to 1. The error quantity ε measures how closely the mean envelope passes by the local mean points. Thus the objective is to solve the least-squares problem min ‖ε‖². It can be shown that A ∈ R^{p×n} has full row rank. However, the above problem is ill-posed because there are generally more sampling instants than local mean points, i.e., n > p. To proceed, we impose the following gradient constraints on ẑ:

    ẑ_2 − ẑ_1 = 0,    (8)
    ẑ_{t+1} − 2ẑ_t + ẑ_{t−1} = 0,    t ∈ {2, . . . , n − 1},    (9)
    ẑ_n − ẑ_{n−1} = 0.    (10)

Collectively, these constraints can be written as

    Tẑ = 0,    (11)

where T is a sparse matrix of the form

    T = [ −1   1
           1  −2   1
               ⋱   ⋱   ⋱
                   1  −2   1
                      −1   1 ] ∈ R^{n×n}.    (12)

Combining linear regression (7) and constraint (11), the mean envelope is obtained by solving the following regularized least-squares (RLS) problem with second-order Tikhonov regularization [51]:

    min_{ẑ} ‖z̄ − Aẑ‖² + λ‖Tẑ‖²,    λ > 0.    (13)

LEMMA 1. The RLS problem (13) has a closed-form solution given by:

    ẑ = (AᵀA + λTᵀT)⁻¹ Aᵀ z̄.    (14)

Proof. Consider the following augmentation:

    A_a = [ A ; √λ T ] ∈ R^{(p+n)×n},    z̄_a = [ z̄ ; 0 ] ∈ R^{p+n},    (15)

so that (13) is equivalent to

    min_{ẑ} ‖z̄_a − A_a ẑ‖².    (16)

It can be shown by induction that the Tikhonov matrix T has rank (n − 1) with a one-dimensional null space engendered by the vector 1 ∈ R^n with 1 in all its elements. Moreover, 1 cannot be in the null space of A, since the contrary would imply that every row of A sums to zero, which is absurd. Consequently, A_a has full column rank, and its Moore-Penrose pseudoinverse A⁺_a is given by

    A⁺_a = (A_aᵀ A_a)⁻¹ A_aᵀ = (AᵀA + λTᵀT)⁻¹ A_aᵀ.    (17)

Moreover, (16) has a closed-form solution given by ẑ = A⁺_a z̄_a. Substituting (15) and (17) into the latter equation yields (14). ∎
By (14), the proposed approach provides the mean envelope ẑ as the solution of an RLS minimization problem. This can be compared to the convex optimization methods in [4], [39], [40]. Nevertheless, if mode decomposition stops at optimization, we are faced with the same problem as the cited references, namely, the symmetry of the IMF and how well the residue captures the global trend depend heavily on the choice of λ. In the next section, we propose a sifting-like approach to overcome this problem.
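A compact numpy sketch of the RLS construction follows; the function names are ours, and the boundary rows of the Tikhonov matrix follow our reconstruction of (12) (second differences inside, first differences at the ends, so that the only null direction is the constant vector).

```python
import numpy as np

def interp_matrix(t_bar, n):
    """Regressor A of (5)-(7): row l linearly interpolates ẑ at t̄_l."""
    p = len(t_bar)
    A = np.zeros((p, n))
    i = np.clip(np.floor(t_bar).astype(int), 0, n - 2)
    A[np.arange(p), i] = (i + 1) - t_bar        # a_{l, i_l}
    A[np.arange(p), i + 1] = t_bar - i          # a_{l, i_l + 1}
    return A

def tikhonov_matrix(n):
    """Tikhonov matrix T of (12): rank n - 1, annihilates constants."""
    T = np.zeros((n, n))
    T[0, :2] = [-1, 1]
    for t in range(1, n - 1):
        T[t, t - 1:t + 2] = [1, -2, 1]
    T[-1, -2:] = [-1, 1]
    return T

def mean_envelope(t_bar, z_bar, n, lam=1e-2):
    """Closed-form RLS solution (14) for the mean envelope."""
    A, T = interp_matrix(t_bar, n), tikhonov_matrix(n)
    return np.linalg.solve(A.T @ A + lam * T.T @ T, A.T @ z_bar)
```

A quick sanity check: since each row of A sums to 1 and T annihilates constants, constant local mean points yield a constant mean envelope, for any λ.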

C. Bootstrap Sifting
The classical sifting process would repeat the previous calculations on the proto IMF z̃ = z − ẑ, yielding a new mean envelope each time. Instead, we propose an iterative operation on the calculated quantities z̄ and ẑ without further processing the extrema of intermediate signals. Effectively, the process "bootstraps" from the initial local mean points.
Using notations similar to classical sifting (Algorithm 1), we shall denote by z̄^k and ẑ^k the results at the k-th iteration of bootstrap sifting. After N iterations, the final mean envelope z_env and IMF z_imf shall be determined as

    z_env = z^N_env,    z_imf = z − z_env.    (18)

The objective is to guarantee convergence of z_env by some measure. First, initialize z̄^0 as the local mean points according to (1)-(2). This calculation is performed only once. Thus, A remains unchanged throughout the iterations; recall its definition by (5)-(6). Next, set z^0_env = 0 and, for k = 0, 1, 2, . . . , iterate as follows:

    ẑ^k = (AᵀA + λTᵀT)⁻¹ Aᵀ z̄^k,    (19)
    z̄^{k+1} = z̄^k − Aẑ^k,    z^{k+1}_env = z^k_env + ẑ^k.    (20)

Essentially, z̄^{k+1} is the fit error in the k-th iteration, and (19) applies the closed-form solution (14) to successive fit errors. Note that, substituting (19) into (20), it can be found that z̄^k undergoes the following update process:

    z̄^{k+1} = (I − P) z̄^k,    P ≜ A (AᵀA + λTᵀT)⁻¹ Aᵀ.    (21)

LEMMA 2. The matrix (I − P) in (21) is a contraction, i.e., ‖(I − P) z̄‖ < ‖z̄‖ for all z̄ ≠ 0.

Proof. Consider again the augmentation (15). Since P_a ≜ A_a A⁺_a is the orthogonal projector of R^{p+n} onto the range of A_a, itself a subspace of R^{p+n}, the projection P_a z̄_a and its orthogonal complement (I − P_a) z̄_a satisfy the following 2-norm inequalities:

    ‖P_a z̄_a‖ ≤ ‖z̄_a‖,    ‖(I − P_a) z̄_a‖ ≤ ‖z̄_a‖.    (22)

Moreover, ‖z̄_a‖ = ‖z̄‖. From (17), one obtains

    A⁺_a z̄_a = (AᵀA + λTᵀT)⁻¹ Aᵀ z̄.    (23)

Then, by an abuse of the notation I to signify identity matrices of appropriate sizes,

    (I − P_a) z̄_a = [ (I − P) z̄ ; −P_r z̄ ],    P_r ≜ √λ T (AᵀA + λTᵀT)⁻¹ Aᵀ.

Consider the upper component. Using (22)-(23), one obtains

    ‖(I − P) z̄‖² = ‖(I − P_a) z̄_a‖² − ‖P_r z̄‖² ≤ ‖z̄‖² − ‖P_r z̄‖²,

and P_r z̄ = 0 only if z̄ is a multiple of 1 ∈ R^p, in which case (I − P) z̄ = 0. Hence ‖(I − P) z̄‖ < ‖z̄‖ for all z̄ ≠ 0, which shows that (I − P) is a contraction. ∎
Lemma 2 guarantees that z̄^k → 0. It follows from (19) that ẑ^k → 0 also. Moreover, at the k-th iteration, how closely the mean envelope z^k_env passes by the local mean points z̄^0 is measured by ‖z̄^0 − A z^k_env‖. Now, telescoping (20) gives

    z̄^0 − A z^k_env = z̄^0 − Σ_{j=0}^{k−1} A ẑ^j = z̄^k.

Hence, one has the following result.
THEOREM 3. ‖z̄^0 − A z^k_env‖ = ‖z̄^k‖ → 0 as the number k of iterations increases. In other words, the mean envelope passes asymptotically close to the local mean points.
Theorem 3 implies that the IMF is asymptotically symmetric. Moreover, ‖z̄^k‖ is a well-defined stop criterion. It is worth noting that convergence is guaranteed independently of the regularization parameter λ, although the latter affects the convergence rate.
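The contraction property of Lemma 2 and the identity of Theorem 3 can be checked numerically with a short sketch of the iteration (19)-(20); matrices and sizes below are made up for illustration.

```python
import numpy as np

def bootstrap_sift(A, T, z_bar0, lam=1e-2, n_iter=40):
    """Bootstrap sifting (19)-(20): the closed-form RLS solution is applied
    to successive fit errors; A and the mean points are never recomputed."""
    S = np.linalg.inv(A.T @ A + lam * T.T @ T) @ A.T    # solution operator of (19)
    z_env = np.zeros(A.shape[1])
    z_bar = z_bar0.astype(float).copy()
    norms = []
    for _ in range(n_iter):
        z_hat = S @ z_bar            # (19) applied to the current fit error
        z_env = z_env + z_hat        # accumulate the mean envelope
        z_bar = z_bar - A @ z_hat    # (20): next fit error
        norms.append(np.linalg.norm(z_bar))
    return z_env, norms
```

The recorded norms should decrease monotonically (contraction), and the final fit error ‖z̄^0 − A z_env‖ should equal the last ‖z̄^k‖, as stated by Theorem 3.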

D. Bootstrap EMD Algorithm
The above sections describe the steps for one level of mode decomposition. Repeating these steps produces the proposed bootstrap EMD algorithm, as outlined in Algorithm 2. Bootstrap sifting can be terminated by setting a threshold ε on ‖z̄^k‖ or a maximum number of iterations.
Algorithm 2: Bootstrap EMD
  Initialize: z^1_in = original signal.
  for l = 1 to M do
    begin level-l bootstrap EMD:
      z = z^l_in.
      Determine local extrema of z.
      Calculate local mean points (t̄, z̄) according to (1)-(2).
      Calculate A using t̄ according to (5)-(7).
      begin bootstrap sifting:
        Initialize: k = 0, z̄^0 = z̄, z^0_env = 0;
        while ‖z̄^k‖ > ε or k ≤ k_max do
          Iterate (19)-(20);
          k = k + 1.
        end
      end
      Level-l output: Residue: z^l_res = z^k_env. IMF: z^l_imf = z − z^l_res.
    end
  end

Compared to classical EMD (Algorithm 1), it is clear that the extrema and local mean points are processed only once, before entering the bootstrap sifting loop. While the mean envelope has the same meaning as in classical sifting, there is no notion of proto IMF in bootstrap sifting.

III. 2D-MULTIVARIATE BOOTSTRAP EMD AND CIELAB IMAGE DECOMPOSITION
We now propose a 2D-multivariate extension of bootstrap EMD to color images in the CIELAB color space, which is identifiable with the 3-dimensional real space and denoted by Z_Lab. The image is defined on a 2D support D of m × n pixels. Let N = m × n denote the total number of pixels.
Since Z_Lab is not an ordered set, the meaning of upper and lower envelopes becomes problematic. In contrast, the concept of local mean points is applicable even in the multivariate setting. Their locations will be defined via triangulation of the 2D support D.
In the following, bold lower-case letters are employed to indicate vector quantities.

A. Delaunay Triangulation from Local Extrema
Denoting by δ_x z and δ_y z the row- and column-wise finite differences of the image,

    δ_x z_{i,j} = z_{i+1,j} − z_{i,j},    δ_y z_{i,j} = z_{i,j+1} − z_{i,j},    (24)

we consider that component L has a 2D local maximum (resp. minimum) at (i, j) if both δ_x L and δ_y L cross zero at (i, j) in the negative (resp. positive) sense:

    δ_x L_{i−1,j} > 0 > δ_x L_{i,j}    and    δ_y L_{i,j−1} > 0 > δ_y L_{i,j},    (25)
    δ_x L_{i−1,j} < 0 < δ_x L_{i,j}    and    δ_y L_{i,j−1} < 0 < δ_y L_{i,j}.    (26)

We also include 1D extrema along the image boundaries to prevent distortions when there are few 2D extrema near the boundaries. The a and b components are treated similarly. We construct a Delaunay triangulation of the image from the set of all local maxima, making no distinction among the different components. Similarly, a second Delaunay triangulation is constructed from the set of all minima. Note that these two triangulations are neither upper nor lower envelopes, since we pool the extrema of all components. Instead, they provide a basis for the computation of local mean points.
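In practice the two triangulations can be built directly with scipy.spatial; the pooled maxima coordinates below are made up for illustration, and locating the triangle that encloses a given extremum is a single query.

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical pooled maxima locations (all CIELAB components together)
maxima = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.],
                   [5., 2.], [3., 8.]])
tri = Delaunay(maxima)                      # DT of all maxima, pooled
minimum = np.array([4.0, 5.0])              # a local minimum to be enclosed
s = int(tri.find_simplex(minimum))          # index of the enclosing triangle
vertices = maxima[tri.simplices[s]]         # its three maxima vertices
barycenter = vertices.mean(axis=0)          # the triangle's barycenter b_l
```

find_simplex returns −1 for points outside the convex hull, which is one reason the boundary 1D extrema mentioned above are useful: they keep the hull close to the image border.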

B. Local Mean Point
The local mean point is computed in a triangulated neighborhood of an extremum. Consider, for each local minimum, the unique triangle with local maxima as vertices that encloses it. Assuming that the extrema are numbered in some manner, let v_l ∈ D be the image coordinates of the enclosed minimum, l being its index number. Meanwhile, denote by v_{li} ∈ D (i = 1, 2, 3) the coordinates of the three vertices of this triangle, and by b_l the coordinates of the triangle's barycenter. In other words,

    b_l = (1/3)(v_{l1} + v_{l2} + v_{l3}).

An example is shown in Fig. 2. In addition, let z_l and z_{li} (i = 1, 2, 3) be the intensities of the enclosed local minimum and the vertices, and z_{b_l} the intensity of the barycenter. More precisely, z_{b_l} is the intensity of the pixel on which b_l is located.
Bilateral weights. Consider the following Gaussian photometric range function between the barycenter and each of the four vertices:

    r_j = exp(−‖z_j − z_{b_l}‖² / (2σ²)),    j ∈ {l, l1, l2, l3},    (27)

where ‖·‖ denotes the Euclidean distance in the CIELAB color space, and σ is a spread parameter. Next, introduce the weights

    w_l = (1/2) r_l,    w_{li} = (1/6) r_{li},    i = 1, 2, 3,    (28)

which we further normalize to give a sum of 1:

    w̃_j = w_j / (w_l + w_{l1} + w_{l2} + w_{l3}),    j ∈ {l, l1, l2, l3}.    (29)

Now, we define the l-th local mean point (v̄_l, z̄_l) by the following weighted mean location and intensity:

    v̄_l = w̃_l v_l + Σ_{i=1}^{3} w̃_{li} v_{li},    z̄_l = w̃_l z_l + Σ_{i=1}^{3} w̃_{li} z_{li}.    (30)

It can be seen that (30) is a multivariate extension of (1) and (2). Its formulation is valid even though the extrema may be those of different components, as it does not rely on an ordering of the extrema. The local mean point is obviously a barycentric mean of the four vertices with the w̃_j's as weights.
In the special case where σ = ∞ or, equivalently, the photometric ranges r_j are all set to 1, the local mean point (v̄_l, z̄_l) is a geometric midpoint of the vertices. This extends the notion of symmetry seen in the 1D-univariate case.
However, in the general case, (v̄_l, z̄_l) is determined by weights that contain both geometric and photometric information. The idea is inspired by bilateral Gaussian filtering [50]; the goal here is to allow the central intensity of the triangle to influence the local mean point. Conceptually, instead of a strictly symmetric IMF, we desire a residue that reflects photometric trends in the midst of extrema.
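A minimal sketch of the bilateral local mean point of (30) follows. The function name is ours, and the geometric weights (1/2, 1/6, 1/6, 1/6) are an assumption extending the 1D mid-height choice; setting σ = ∞ recovers the purely geometric case discussed above.

```python
import numpy as np

def local_mean_point(v, z, v_tri, z_tri, z_bary, sigma=10.0,
                     g=(0.5, 1/6, 1/6, 1/6)):
    """Bilateral local mean point: Gaussian photometric ranges between the
    barycenter intensity z_bary and the four vertices, combined with the
    (assumed) geometric weights g, then normalized as in (30)."""
    pts = np.vstack([v, v_tri])     # 4 vertex locations (extremum + triangle)
    zs = np.vstack([z, z_tri])      # 4 CIELAB intensities
    # Gaussian photometric range to the barycenter intensity
    r = np.exp(-np.sum((zs - z_bary) ** 2, axis=1) / (2 * sigma ** 2))
    w = np.asarray(g) * r           # combine geometric and photometric parts
    w = w / w.sum()                 # normalize to sum to 1
    return w @ pts, w @ zs          # (30): mean location, mean intensity
```

With σ = ∞ the result is the midpoint between the enclosed extremum and the triangle's barycenter; finite σ pulls the mean point toward photometrically similar vertices.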
The above describes the computation of local mean points based on local minima enclosed in a maxima-associated triangulation. An analogous computation can be performed by role reversal between maxima and minima, i.e., for each maximum enclosed in a minima-associated triangulation, as shown in Fig. 3. Performing both computations generates a denser distribution of local mean points.

Fig. 3: Triangle (blue) with vertices at local minima (solid blue inverted-triangular markers) that encloses a local maximum (solid red triangular marker, which is v_{l2} in Fig. 2). Barycenter (blue circular marker) and local mean point (magenta diamond marker).

Remark 1.
Note also that the same triangle may enclose more than one extremum, as indicated by the hollow markers in Fig. 2 and Fig. 3. Nevertheless, this does not alter the nature of the computation since it is performed for each extremum. Thus, some triangles may be used multiple times, but each time yielding a different local mean point.

C. Mean Envelope via RLS and Bootstrap Sifting
We now compute a mean envelope ẑ by a 2D-multivariate extension of the RLS minimization problem in Section II-B. The local mean points are non-uniformly distributed on the support D. We require them to be linear interpolants of the mean envelope.
A given local mean point (v̄_l, z̄_l) is located in a square formed by four neighboring pixels at coordinates (i_l, j_l), (i_l + 1, j_l), (i_l + 1, j_l + 1) and (i_l, j_l + 1), as shown in Fig. 4. The square is divided into upper-right and lower-left triangles. (Alternatively, one may adopt a division into upper-left and lower-right triangles.) The local mean point's location relative to the top-left pixel is given by (x̃_l, ỹ_l) = v̄_l − (i_l, j_l). The interpolation error depends on which triangle v̄_l falls on [52]:

upper right (x̃_l > ỹ_l):

    ε_l = z̄_l − [(1 − x̃_l) ẑ_{i_l, j_l} + (x̃_l − ỹ_l) ẑ_{i_l+1, j_l} + ỹ_l ẑ_{i_l+1, j_l+1}],    (31)

lower left (x̃_l ≤ ỹ_l):

    ε_l = z̄_l − [(1 − ỹ_l) ẑ_{i_l, j_l} + (ỹ_l − x̃_l) ẑ_{i_l, j_l+1} + x̃_l ẑ_{i_l+1, j_l+1}].    (32)

Let p be the total number of local mean points. Then, (31) and (32) yield p 3-dimensional linear regression equations, in matrix notation:

    ε = z̄ − Aẑ,    (33)

where ε ∈ R^{p×3}, z̄ ∈ R^{p×3} and, by a slight abuse of notation, ẑ ∈ R^{N×3} is obtained by stacking the ẑ_{i,j}'s column-wise, with the CIELAB components arranged row-wise. A is a p × N sparse matrix of which each row comprises the coefficients of (31) or (32) as its only three non-zero elements. As in the univariate case, each row of A sums to 1, and A has full row rank. Here, gradient regularization is imposed, in the manner of Section II-B, for each j-th row, j ∈ {1, . . . , m}, and each i-th column, i ∈ {1, . . . , n}, of the support D. Collectively, the two sets of constraints can be written as

    T_r ẑ = 0,    T_c ẑ = 0.    (34)

Since the ẑ_{i,j}'s are stacked column-wise into ẑ, T_c is an N × N block-diagonal sparse matrix with diagonal blocks equal to T given by (12). T_r is also an N × N sparse matrix consisting of a bijective rearrangement of the elements of T_c. Hence, both matrices have rank N − 1.
The mean envelope is then obtained by solving the following RLS problem with two Tikhonov regularization terms:

    min_{ẑ} ‖z̄ − Aẑ‖²_F + λ_r ‖T_r ẑ‖²_F + λ_c ‖T_c ẑ‖²_F.    (35)

By an argument analogous to Lemma 1, specifically, by an augmentation of A by √λ_r T_r and √λ_c T_c, the RLS problem (35) has a closed-form solution:

    ẑ = (AᵀA + λ_r T_rᵀ T_r + λ_c T_cᵀ T_c)⁻¹ Aᵀ z̄.    (36)

Based on the above, bootstrap sifting takes the form

    ẑ^k = (AᵀA + λ_r T_rᵀ T_r + λ_c T_cᵀ T_c)⁻¹ Aᵀ z̄^k,    (37)
    z̄^{k+1} = z̄^k − Aẑ^k,    z^{k+1}_env = z^k_env + ẑ^k.    (38)

It can be shown by arguments similar to Lemma 2 and Theorem 3 that bootstrap sifting here is also convergent, i.e., the mean envelope asymptotically approaches the local mean points.
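The 2D solve can be sketched compactly with Kronecker products: with column-wise pixel stacking (index = i + j·m), the column regularizer is block-diagonal and the row regularizer is its rearrangement. The sketch below is dense for brevity (the paper's matrices are sparse), the function names are ours, and the boundary rows of the 1D block follow our reconstruction of (12).

```python
import numpy as np

def second_diff(n):
    """1D Tikhonov block per (12): second differences inside,
    first differences at the boundaries (annihilates constants)."""
    T = np.zeros((n, n))
    T[0, :2] = [-1, 1]
    for t in range(1, n - 1):
        T[t, t - 1:t + 2] = [1, -2, 1]
    T[-1, -2:] = [-1, 1]
    return T

def mean_envelope_2d(A, z_bar, m, n, lam_r=1e-2, lam_c=1e-2):
    """Closed-form solution of the 2D RLS problem (35), solved for the
    three CIELAB channels at once (shared normal matrix)."""
    Tc = np.kron(np.eye(n), second_diff(m))   # regularization down each column
    Tr = np.kron(second_diff(n), np.eye(m))   # regularization along each row
    M = A.T @ A + lam_r * Tr.T @ Tr + lam_c * Tc.T @ Tc
    return np.linalg.solve(M, A.T @ z_bar)    # one solve, 3 RHS columns
```

As in the 1D case, rows of A summing to 1 and regularizers annihilating constants imply that constant local mean intensities yield a constant mean envelope.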

D. Decomposition Algorithm with Perturbation
In summary, bootstrap EMD of CIELAB images is outlined in Algorithm 3, which has the same basic structure as Algorithm 2. The Frobenius norm ‖·‖_F is used in the stop criterion.

Algorithm 3: CIELAB Bootstrap EMD
  Initialize: z^1_in = original CIELAB image. Set up T_r and T_c.
  for l = 1 to M do
    Generate random noise n. z = z^l_in + n.
    begin level-l bootstrap EMD:
      Determine local extrema of z: (24)-(26).
      Construct Delaunay triangulations from local maxima and minima, respectively.
      Compute local mean locations and intensities (v̄, z̄) according to (30).
      Compute A using v̄ according to (31)-(33).
      begin bootstrap sifting:
        Initialize: k = 0, z̄^0 = z̄, z^0_env = 0;
        while ‖z̄^k‖_F > ε or k ≤ k_max do
          Iterate the bootstrap-sifting update of Section III-C;
          k = k + 1.
        end
      end
      Level-l output: Residue: z^l_res = z^k_env. IMF: z^l_imf = z − z^l_res.
    end
  end
Additive noise. Notice that random noise is added to the input image at the beginning of decomposition. The reason is that the residual image smoothens after every level, which means that when it is used as input to the next level, the extrema become sparser and the associated triangulations coarser. This leads to geometric and photometric distortions, producing out-of-shape features and phantom colors (see the example in Section V).
To avoid these artifacts, non-stationary noise at randomly sampled locations is added. It is generated according to the following formula:

    n_{i,j} = s_{i,j} if G_{i,j} > γ,    n_{i,j} = 0 otherwise,    (39)

where s_{i,j} ∼ N(0, σ) and G_{i,j} ∼ U(0, 1). Image pixels are therefore perturbed at an average spatial frequency of (1 − γ). For a sufficiently large γ, these perturbations are sparsely located and do not corrupt the residues; however, they remain in the IMFs. Noiseless IMFs can be recovered as the difference between the residues of successive levels:

    z^l_imf = z^{l−1}_res − z^l_res,    with z^0_res = z^1_in.

The above is reminiscent of, but not exactly, white-noise-assisted ensemble EMD [22], since the added noise is neither white nor regularly sampled, and EMD is not repeated to create an ensemble.
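The perturbation scheme above amounts to Gaussian amplitudes gated by a uniform mask; a minimal numpy sketch, with parameter names of our choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_noise(m, n, sigma=5.0, gamma=0.95):
    """Sparse non-stationary perturbation: Gaussian amplitude s_{i,j},
    active only where the uniform gate G_{i,j} exceeds gamma, so roughly
    a fraction (1 - gamma) of the pixels is perturbed."""
    s = rng.normal(0.0, sigma, size=(m, n))   # s_{i,j} ~ N(0, sigma)
    G = rng.uniform(0.0, 1.0, size=(m, n))    # G_{i,j} ~ U(0, 1)
    return np.where(G > gamma, s, 0.0)
```

In the CIELAB setting the same mask would be applied to each of the three channels (or an independent mask per channel); either choice keeps the perturbations sparse for γ close to 1.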

IV. COMPARISON WITH RELATED WORKS
This section discusses similarities and differences between the proposed approach and closely related references already cited in the Introduction. For easier comparison, relevant expressions in those references are reproduced below using notations of the present work.
Among the earlier cited references that replaced sifting with an optimization-based determination of the mean envelope, references [4], [35], [37], [39], [40] employed the notions of symmetry or barycentric mean, although they differed from the present approach in the definition of the latter.
In the 1D-univariate setting, Colominas et al. [39] proposed quadratic minimization of an error quantity of the form (in the notations of Algorithm 1 and Section II-A)

    ε_l = z̃_{t_l} + w_{l−1} z̃_{t_{l−1}} + w_{l+1} z̃_{t_{l+1}},    (40)

where z̃ = z − ẑ was the IMF, and the weights were the barycentric coordinates of t_l relative to its two neighbors, such that w_{l−1} + w_{l+1} = 1 and t_l = w_{l−1} t_{l−1} + w_{l+1} t_{l+1}. It can then be deduced that

    ε_l / 2 = (1/2) [ z̃_{t_l} + (w_{l−1} z̃_{t_{l−1}} + w_{l+1} z̃_{t_{l+1}}) ].    (41)

Comparing with (1)-(2), the above (40)-(41) defined a mean point (t_l, ε_l/2) of the IMF at the location of the l-th local extremum, although the authors did not explicitly state so. Minimization amounted to requiring the IMF to be zero at such mean points. Unlike the proposed method, which imposes symmetry about the local mean-point locations t̄_l, [39] imposed it about the extremum locations t_l.
Colominas et al. [40] was a 2D-univariate extension of [39]. As in the 1D case, the l-th IMF mean point was colocated with the l-th local extremum, i.e., (v_l, z_l) in the notations of Section III-B, whereas its intensity was ε_l = Σ_j w̃_j z̃_j in the manner of (30), with w̃_l = 1 and (w̃_{l1}, w̃_{l2}, w̃_{l3}) being the barycentric coordinates of the local extremum relative to the three closest extrema of the opposite type. Hence, reference [40] and the proposed approach share the same notion of enclosing triangle, although [40] did not employ Delaunay triangulation. Moreover, it was restricted to the univariate case.
In both [39] and [40], the issue of interpolation in the sense of (5) did not arise because the IMF mean points were defined at sample or pixel locations. Thus, the RLS problem in both 1D and 2D had the form

min over ẑ of ‖A(z − ẑ)‖² + λ‖Dẑ‖².   (42)

Here, A was different from the present work but arose from the coefficients of (40) and its 2D extension. As in the present work, second-order Tikhonov regularization was considered.

The above notion of symmetry and, therefore, the implicit definition of IMF mean points were also central to references [4], [35], [37]. Pustelnik et al. [37] was a 1D-univariate work that, instead of a penalty function, treated the symmetry conditions ε_l = 0 as hard constraints on the set of admissible mean envelopes.
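A quadratic RLS problem with second-order Tikhonov regularization admits a closed-form solution via the normal equations. The sketch below is generic, assuming an arbitrary well-conditioned A rather than the specific operator of [39] or of the present work:

```python
import numpy as np

def second_difference_matrix(n):
    """(n-2) x n second-order finite-difference operator for Tikhonov."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D

def tikhonov_rls(A, b, D, lam):
    """Closed-form minimizer of ||A x - b||^2 + lam * ||D x||^2,
    obtained from the normal equations (A'A + lam D'D) x = A'b."""
    return np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ b)
```

It is this closed form that bootstrap sifting reuses at each iteration without recomputing extrema.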
The above was extended to the 2D-univariate case in Schmitt et al. [4]. Instead of hard constraints, the 1-norm of the symmetry conditions served as the regularization function, resulting in a non-smooth convex optimization problem.
Oberlin et al. [35] was a 1D-multivariate approach that relaxed the constraints ε_l = 0 to an inequality, whereas the minimized quadratic quantity was the mean-envelope magnitude Σ_i ‖ẑ_i‖². The admissible set of k-variate envelopes was defined as a collection of B-splines of order k. The difficulty lay in the determination of the knots.

Although the authors of the above references argued for ε_l = 0 as expressing w_{l−1}(t_{l−1}, z̃(t_{l−1})) + w_{l+1}(t_{l+1}, z̃(t_{l+1})) to be the (vertical) mirror point of (t_l, z_l), an argument which presumed a 1D-univariate setting, it is evident from our discussion in Sections II-III that even (40)-(41) can be extended to the 2D-multivariate setting. Moreover, since (42) has a closed-form solution, following arguments similar to the proof of Lemma 2, bootstrap sifting can also be applied to this formulation.
Xu et al. [33] is worth mentioning as a 2D-univariate, non-optimization approach that determined the mean envelope by DT and B-spline smoothing. A single DT of the plane was formed by all characteristic points, i.e., maxima, minima and saddle points. With the signal values at the vertices as heights, the triangular mesh was a piecewise linear surface in 3D space. Then, roughly speaking, the mean envelope was determined by interpolating certain interior points of the triangles and by further smoothing. Those interior points, called "smoothed data", were somewhat analogous to local mean points; however, their relation with IMF symmetry was not made explicit. This approach required sifting in the classical sense.

V. EXAMPLES OF APPLICATION

A. Bootstrap EMD of 1D-Univariate Signal
We consider a non-stationary signal z composed of three components: an FM signal with a high carrier frequency modulated by a low-frequency sinusoid, an AM signal with a medium carrier frequency modulated by a low-frequency triangular wave, and a PM signal with a low carrier frequency modulated in phase by a random-walk process:

z(t) = a_FM sin(2π f_FM t + (a_1/f_1) sin(2π f_1 t)) + s_2(t) sin(2π f_AM t) + a_PM sin(2π f_PM t + s_3(t)),

where the FM parameters are a_FM = 3.5, f_FM = 1/8 Hz, a_1 = 1/640 Hz, f_1 = 1/187 Hz; the AM parameters are f_AM = 1/29 Hz, and s_2(t) is a triangular wave of amplitude 4 and frequency 1/333 Hz; and the PM parameters are a_PM = 5, f_PM = 1/300 Hz, and s_3(t) is a random-walk process with a mean update frequency of 1/4 Hz and a mean step size of 0.2 radians. The signal is sampled at 1 Hz over 1000 seconds. The composite signal and its components are shown in Fig. 5(a). The barycentric weights of the local mean points are (1/4, 1/2, 1/4).

In this example, the performance of bootstrap EMD is compared with MATLAB's implementation of the classical EMD and with the present author's implementation of the unconstrained optimization approach (UOAEMD) strictly according to [39]. The regularization parameters of bootstrap EMD and UOAEMD are both set to 1. No sifting is performed in UOAEMD.
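A test signal of this kind can be synthesized along the following lines. The exact modulation formulas and random-walk mechanics below are our assumptions inferred from the stated parameters, not taken verbatim from the paper:

```python
import numpy as np

fs, T = 1.0, 1000                      # 1 Hz sampling over 1000 s
t = np.arange(0, T, 1.0 / fs)
rng = np.random.default_rng(0)

# FM component: sinusoidal frequency modulation (assumed form)
a_fm, f_fm, a1, f1 = 3.5, 1/8, 1/640, 1/187
z_fm = a_fm * np.sin(2*np.pi*f_fm*t + (a1/f1) * np.sin(2*np.pi*f1*t))

# AM component: triangular envelope of amplitude 4, frequency 1/333 Hz
f_am, f2 = 1/29, 1/333
s2 = 4.0 * (2.0 * np.abs(2.0 * ((t * f2) % 1.0) - 1.0) - 1.0)
z_am = s2 * np.sin(2*np.pi*f_am*t)

# PM component: random-walk phase, updates ~1/4 Hz, steps ~0.2 rad (assumed)
a_pm, f_pm = 5.0, 1/300
steps = 0.2 * rng.standard_normal(t.size) * (rng.uniform(size=t.size) < 0.25)
s3 = np.cumsum(steps)
z_pm = a_pm * np.sin(2*np.pi*f_pm*t + s3)

z = z_fm + z_am + z_pm                 # composite non-stationary test signal
```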
As shown in Fig. 5(b), bootstrap EMD successfully separates the FM, AM and PM components into the first, second and third IMFs, respectively, leaving a practically flat residue. There is some mode mixing between the FM and AM modes, especially around t = 700. The third IMF captures the low-frequency trend of the PM mode without the jumps due to the random-walk phase; these high-frequency contents are also mixed into IMF 1. Fig. 5(c) shows that the classical EMD accurately separates the FM and AM modes into the first two IMFs with minimal mode mixing. IMF 3 also captures the low-frequency trend of the PM mode, but with a significant distortion around t = 500. Part of the trend is found in the residue.
UOAEMD separates the FM mode with minimal mode mixing, as shown in Fig. 5(d). There is a slight rounding of the triangular modulation of the AM mode. However, the PM mode is decomposed into the next three IMFs.
The three approaches have displayed comparable overall performance, each with its own strengths. Bootstrap EMD and classical EMD separate the non-stationary components per se, with the proposed approach showing higher overall fidelity. One might argue that UOAEMD yields a finer decomposition of the PM component. However, strictly speaking, the PM mode is nonlinear and non-stationary, and its key features are not recognizable after decomposition into three IMFs.

B. Bootstrap EMD of CIELAB Image: Female Portrait

Bootstrap EMD is applied to a 400×400 portrait of a woman (Fig. 6). This is a well-lit, high-contrast image with a full color spectrum. The original RGB image is first converted to CIELAB colors, the L-, a- and b-components of which are shown in the first row of Fig. 7.
The following rows of Fig. 7 display the components of the IMFs obtained after 1, 2, 3 and 6 levels of decomposition. Since the components are real numbers, each has been scaled to [0, 1] for visualization as a grayscale image. It can be seen that as the level increases, the IMFs progress from fine to coarse features. Moreover, this progression is coordinated among the three components.
Another way of visualizing the IMFs, presented in Fig. 8, is to cumulate the multivariate IMFs from level 1 upward: Σ_{l=1}^{k} z^l_imf. Therefore, the cumulative IMF is the complement of the residue, the sum of the pair being the original image. The cumulative multivariate IMFs (and IMFs) are not true images in the CIELAB color space because the L-components can be negative due to EMD operations. To aid visualization, the L-components of the cumulative IMFs have been replaced by their absolute values, then normalized to [0, 100] for brightness. As shown in Fig. 8 for decomposition levels 1, 3 and 6, dark regions of the cumulative IMFs are where the residues retain most of the original lightness, whereas bright regions indicate local features with significant lightness differences, either positive or negative.
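The cumulation and per-component scaling described above reduce to a few numpy operations. A minimal sketch with illustrative names, where imfs stacks the multivariate IMFs along the first axis:

```python
import numpy as np

def cumulative_imf(imfs, k):
    """Sum the multivariate IMFs from level 1 to k; this is the
    complement of the level-k residue with respect to the original."""
    return np.sum(imfs[:k], axis=0)

def scale01(x):
    """Scale a real-valued component to [0, 1] for grayscale display."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)
```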
The extraction of color features can be better visualized in the hue channels of the residues, which are true images in CIELAB colors and can thus be converted to the HSV color space. Fig. 9 shows the hue channels of the original image and the residues at 1, 3 and 6 levels. It can be seen that isolated, high spatial-frequency color features are progressively extracted from the residues, leaving more homogeneous hues. When viewed together with Fig. 8, it can be observed that many of the fine features are blue or cyan in color, for example, in the hair, eyes and ridge of the nose.
The above observations are further reinforced in Fig. 10, showing a close-up at EMD level 9.

Comparison with component-wise classical BEMD

One can remark on the crispness of the extracted features and the color coherence in both the cumulative IMFs and residues of the proposed approach. This is due to the multivariate calculation of the mean envelope. More precisely, each Delaunay triangulation is constructed with the extrema of all components, and the photometric range function (27) is multivariate.
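A triangulation shared by all components, built over their pooled extrema, can be sketched with scipy.spatial.Delaunay; the point set below is a hypothetical stand-in for detected extremum locations:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical extremum locations pooled from all three CIELAB components
pts = np.array([[0.0, 0.0], [400.0, 0.0], [0.0, 400.0], [400.0, 400.0],
                [150.0, 120.0], [260.0, 310.0], [90.0, 280.0]])
tri = Delaunay(pts)          # one triangulation shared by all components

# Locate the enclosing triangle of a query pixel (index into tri.simplices)
simplex = tri.find_simplex(np.array([[200.0, 200.0]]))
```

Because the triangulation is common to the components, the enclosing triangle of each pixel is the same for L, a and b, which is what keeps feature extraction coordinated.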
In contrast, the decomposition of individual components can lead to incoherent results. Fig. 11 shows the outcome of applying classical univariate BEMD to the test image in the RGB and CIELAB color spaces. Each component is processed separately; the component-wise IMFs and residues are then recombined in the respective color space. The BEMD implementation is based on [53]. For a fair comparison, additive noise has been included, and the stop criterion for classical sifting is set so that the residual errors are comparable with bootstrap EMD. One can observe phantom colors in the residues and fuzzy features in the cumulative IMFs. In addition, color feature extraction is less effective, as evidenced by less homogeneous hues than in Fig. 9 and Fig. 10. This reinforces the need for multivariate EMD for color images.
C. Bootstrap EMD of CIELAB Image: Satellite Image

Fig. 12 presents the bootstrap EMD of a 400 × 400 satellite image of a landslide. In the top row, the original image is evenly exposed but has a narrower color spectrum than the previous example. A majority of the colors are earth and vegetation tones in large, contiguous regions. Small details corresponding to artificial features are mostly grey or white, as seen in the full-color image on the left. The hue channel on the right shows that they, in fact, have blue or cyan hues.
Interest in the decomposition of this image may lie less in the IMFs or the fine local features than in the global patterns remaining in the residues. In effect, the following rows of Fig. 12 show that, as bootstrap EMD progresses, the residual hue channel is rid of high spatial-frequency clutter while preserving the debris-flow pattern. This can be useful in applications where the goal is to recognize debris flows as standout global features of interest; an example is the machine-learning approach of Bui et al. [3]. In addition, Fig. 13 shows that, despite the apparent difference in spatial-frequency contents between the L-, a- and b-components of the original image, each IMF has components with compatible frequency contents. This shows that the proposed multivariate treatment leads to coordinated feature extraction across all components.
Effect of additive noise

To illustrate the benefit of noise-assisted decomposition, we re-apply bootstrap EMD to the satellite image without adding noise at the start of each level. As shown in Fig. 14, phantom colors and shape distortions appear at level 6. The overall appearance of the debris-flow pattern is also less defined (compare with the level-6 residue in Fig. 12).

Color spectrum
In this section, we analyze the color spectral shift resulting from EMD. Fig. 15 shows histograms of the hue channels of the portrait and of its level-6 residues by bootstrap EMD and classical BEMD. In addition, the spectrum obtained by bilateral Gaussian filtering [50] is included for comparison. The histogram bars are rendered with the corresponding hue values. The following observations can be made:
1) Bootstrap EMD extracts colors in the blue-cyan range, thus causing a spectral shift in the residue toward the red-orange range. This corroborates our previous observation in Section V-B that fine features are in the blue-cyan range. Colors in the magenta range are also partially extracted, corresponding to details in the hatband as shown in Fig. 9. As a result, this region of the residues is homogenized from a mixture of blue and magenta to mostly blue.
2) Classical BEMD also extracts colors in similar ranges, but to a lesser extent. This is consistent with the blue speckles seen in the level-6 hue image, Fig. 11.
3) Overall, one can deduce from the above that both multivariate bootstrap EMD and component-wise classical BEMD result in color spectral shifts in the residues that are similar to bilateral filtering. However, unlike EMD, bilateral filtering leads to information loss, and the result depends on the choice of two parameters, namely, the geometric and photometric spreads.

Spatial spectrum

We now consider the effect of bootstrap EMD on the spatial spectrum. Fig. 16 presents the spatial spectra of the individual CIELAB components in the IMFs and residues of the satellite image, at decomposition levels 1, 3 and 6. The spectra are histograms of the estimated instantaneous spatial periods of the individual components, representing feature sizes. Estimation of the instantaneous frequencies is performed both horizontally and vertically, and the counts are aggregated.
It can be seen that bootstrap EMD causes a shift in the residues toward higher spatial periods; the residues thus contain larger features and trends. This shift increases with the decomposition level. Meanwhile, the IMF spectra lie in the low periods, beginning with the level-1 IMF centered at less than 6 pixels, and increasing with the decomposition level. Moreover, spectral separation can be observed between IMFs of different decomposition levels.
It is worth noting that, despite the original components (black lines) having significant spectral offsets, the spectra of the residues and IMFs are aligned across all components. This corroborates our claim that the proposed multivariate approach results in coordinated feature extraction among the components.
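The paper's exact instantaneous-frequency estimator is not specified in this section; as a rough stand-in, counting zero crossings of each demeaned row yields a mean spatial period per row, which can then be histogrammed across rows and columns:

```python
import numpy as np

def row_periods(img):
    """Estimate a mean spatial period per row from sign changes of the
    demeaned row signal (two zero crossings per oscillation period)."""
    periods = []
    for row in img:
        x = row - row.mean()
        crossings = np.nonzero(np.diff(np.signbit(x)))[0]
        if len(crossings) > 1:
            periods.append(2.0 * np.mean(np.diff(crossings)))
    return np.array(periods)
```

Applying the same estimator to the transposed image gives the vertical counterpart, and the two sets of counts can be aggregated as in Fig. 16.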

VI. CONCLUSION
A new optimization-based EMD approach has been proposed, of which 1D-univariate and 2D-multivariate formulations have been presented. The approach is based on three elements: the notion of barycentric local mean points to impose IMF symmetry, Tikhonov regularized least-squares optimization, and a novel bootstrap sifting process. Unlike classical sifting, bootstrap sifting is computationally efficient as it requires no repeated calculation of extrema, and reuses the RLS closed-form solution. Mathematical proofs of its convergence have been given. We submit that this approach is an improvement on two fronts: over classical EMD and BEMD in providing convergent sifting and a valid stop criterion, and over existing one-shot optimization-based methods in guaranteeing an asymptotic mean envelope independent of regularization parameters. In addition, the proposed multivariate formulation is an improved treatment of color images over existing component-wise approaches.
A 1D-univariate example has shown that bootstrap EMD can decompose a signal into its original FM, AM and PM components with a performance comparable to classical EMD. Application to CIELAB color images has also demonstrated the effectiveness of the 2D-multivariate formulation. In particular, the resulting IMFs and residues are color-coherent, and crisp local details are extracted. The color spectra of the residues corroborated the apparent color shift toward global features. Furthermore, examination of the spatial spectra has revealed coordinated decomposition of all components. Importantly, spatial-spectral separation of the IMFs has been confirmed.
Continued exploration of this approach could consider incorporating bootstrap sifting with the symmetry penalty functions in references [39], [40], or replacing the multicomponent extrema with a 2D extension of the turning-tangent characteristic points of [49]. It would also be interesting to explore the application of bootstrap EMD of color images to sparse representation and machine learning [41].