Morphological Component Analysis of fMRI Networks Using Sparse Dictionary Learning

— Objective: In recent years, sparse representations have been utilized as a data-driven method for identifying the functional connectivity of brain networks. On the other hand, studies have shown that the assumption of independence among the network sources turns out to be surprisingly effective. Inspired by these controversial findings, we propose a method for sparse decomposition of fMRI data based on Morphological Component Analysis and K-SVD dictionary learning (MCA-KSVD), and provide extensive comparisons with conventional ICA. Methods: Specifically, fMRI signals are decomposed into morphological components which have sparse spatial overlap, and a K-SVD algorithm is deployed. In contrast to the statistical independence assumption of ICA, allowing sparse spatial overlap between components is a more physically plausible assumption. We present cases in which the independency assumption fails and the proposed MCA-KSVD recovers more accurate spatial-temporal structures. Guided by random matrix theory, a parameter selection procedure for the algorithm is proposed. Results: MCA-KSVD was applied to resting-state and task fMRI data from human volunteers. Validations show that the method can identify both task-related and resting-state functional networks, resembling those of ICA. Reconstructions and autocorrelation functions show that the sparsity constraint has a distinctive localization effect, yielding sparser resolved networks with higher spatial resolution while suppressing weak signals. Conclusion: While marginally improving the decomposition, the MCA-KSVD method denoises fMRI data much more effectively than ICA over a wide range of SNRs, preserving network structures. Significance: The proposed method can be used as an alternative method for investigating functional connectivity of the brain.


I. INTRODUCTION
Resting-state fMRI has become a powerful and widely used noninvasive imaging technique for measuring temporally correlated low-frequency activity fluctuations in the blood oxygenation level-dependent (BOLD) signal in spatially remote brain areas during rest [1]-[3]. Sets of brain areas that undergo such correlated slow fluctuations in the BOLD signal are termed resting-state networks, revealing ongoing functional connectivity between the networks during rest. Independent Component Analysis (ICA) is one of the most widely applied model-free methods for identifying functional networks [4]-[8]. The method finds spatial or temporal components by maximizing statistical independence among components, either by maximizing non-Gaussianity [9] or by minimizing mutual information [10]. Spatial ICA in particular has become the most popular approach for decomposing brain activity components in fMRI, with many of the components shown to correspond to previously identified large-scale brain components [11]. However, the statistical independence assumption of ICA is not physically plausible and its validity is still largely open for debate [12]-[16], as there is no physical reason for the functional networks to correspond to different activity patterns with independent distributions. Interestingly, the possibility of a more efficient representation using sparse modeling was studied in several works [17]-[20], motivated by the assumption that the structure of the resting-state functional brain connectivity is sparse. It has been demonstrated through various models that sparsity is an effective assumption for fMRI signal separation. For example, a hybrid SACICA method, consisting of a wavelet packet decomposition used to form the sparse approximation coefficients followed by an ICA decomposition, was proposed in [19].
However, that method failed to detect a small number of active voxels, and further research is needed to improve the proposed information fusion. A different approach was investigated in [18], where the sparse learning problem was formulated on the matrix of correlations across multiple subjects rather than directly on the data. The work showed that whole-brain functional connectivity can be concisely represented with highly modular, overlapping pairs of sub-networks. A similar sparse dictionary learning approach was proposed in [17], but for task-related paradigms: the task-related activation was assumed to have a sparse distribution, and sparse dictionary learning was therefore applied to estimate a spatially adaptive design matrix and corresponding sparse signal components. A sparse regression model over the linear coefficients of the GLM, combined with an enhanced Gibbs distribution function that captures spatial constraints, was proposed in [21].
While there are numerous studies investigating the use of sparse modeling for fMRI signal decomposition, only a few aimed to infer a comprehensive collection of functional networks, with a focus on the connection between the sparsity constraint and the independency constraint of the conventional ICA method and their impact on the decomposed functional network connectivity. Specifically, the effect of the sparsity constraint was analyzed mathematically in [15], [16], with controversial findings on over-simplified distribution data sets, while sparse coding algorithms were applied to experimental fMRI data for extraction of functional networks in a few studies [20], [22], [23]; however, the mechanism and conditions underlying the success of the sparsity versus independency assumptions were not the focus of those studies and therefore remain unknown. Interestingly, it was shown in [4], [15], [17] that the most influential factor in the success rate of the ICA algorithm is actually the sparsity of the components, rather than their independence. In fact, ICA was found to be efficient in decomposing fMRI networks particularly when the spatial activation patterns of the networks are sparse, in which case the assumption of sparseness can significantly improve the signal separation accuracy and the computational efficiency of existing ICA algorithms [24].
In this study, we propose a sparsity-based decomposition method, coined MCA-KSVD, for identifying functional brain networks directly from acquired task-based or resting-state fMRI data. Instead of ICA, this method uses Morphological Component Analysis (MCA) combined with the dictionary learning method K-SVD. We investigate the impact of the sparsity versus independency constraints on the quality of the reconstructed fMRI networks. More specifically, the proposed method estimates spatio-temporal components by decomposing the resting-state fMRI data into morphological components [25]-[27] which have sparse spatial overlap. Based on the spatial patterns of the common functional networks, such as the Default Mode Network (DMN), visual, auditory, sensorimotor, and executive control networks, we notice that the number of simultaneously active functional brain networks can be assumed to be sparse across voxels. Allowing such a sparse spatial overlap between network components is a more physically plausible assumption than the statistical independence assumption of the ICA method. The MCA decomposition was originally proposed for image inpainting and texture separation [26], and shares similarities with the problem of dictionary learning [23]. The K-Singular Value Decomposition (K-SVD) algorithm [28] was used to train a sparse dictionary to better fit the data.
Our preliminary results were reported in [29]. This paper presents a more detailed description and a more thorough evaluation of the MCA-KSVD method, focusing on analyzing the effect of the spatial sparsity constraint of the MCA method versus the independency constraint of the conventional ICA method. We provide such a detailed comparison both on fMRI simulations and on experimental task and resting-state fMRI data. Inspired by the mathematical analysis work [15], we simulate cases in which the independency constraint of the ICA method fails and the MCA-KSVD method yields a more accurate decomposition of networks. We also show that the statistical independency constraint of the ICA method in many cases actually implies the spatial sparsity condition, thereby confirming the findings in [4], [15], [17], [24]. On the other hand, the sparsity constraint of the MCA method is a more relaxed constraint, as the MC components are not required to be statistically independent, as evident from the correlation matrices. Due to a distinctive signal localization effect, the MCA-KSVD method yields sparser spatially resolved networks while suppressing weak signals, such as those in the region of moving eyes. As a result, the MCA-KSVD method was found to have a better denoising performance than the ICA method, in terms of achieving a higher noise reduction factor while preserving the spatial-temporal characteristics of the decomposed fMRI networks. The spatial resolution of the fMRI networks reconstructed by the MCA-KSVD method is higher than that of the ICA method by a factor of about 1.4 in our studies. On the other hand, an inappropriate choice of the sparsity parameter in the MCA-KSVD method can result in either a signal leakage artifact or an extreme signal localization effect. We propose a reasonable choice of the sparsity parameter and of the number of components to be decomposed, guided by the rank of the data matrix and random matrix theory.
Overall, we suggest that sparsity-based decompositions such as the proposed MCA-KSVD method can be used as an alternative for identifying functional brain networks in both task and resting-state fMRI, especially when the data are contaminated by noise.

II. METHODS

A. MCA-KSVD method
Consider the acquired fMRI data represented as a spatial-temporal function {s(r_n, t_m)}, n = 1, …, N, m = 1, …, M. The goal of conventional data-driven methods is to detect the patterns of connectivity between brain regions by decomposing the measured noisy N × M data matrix formed from s(r, t) as

S = S_0 + E = DX + E,    (1)

where S_0 is the noiseless data matrix; D is the dictionary, with the k-th column d_k being the time series (also referred to as the atom) of the k-th decomposed component; the k-th row x^(k) of the matrix X represents the spatial variation of the k-th decomposed component, from which functional connectivity of the brain can be inferred; and E is the noise matrix. We further assume that the acquired fMRI data admits a sparse approximation over the dictionary D, that is, one can find a linear combination of a few atoms d_k that is close to the data, i.e., S ≈ Σ_k d_k x^(k), resulting in the following optimization problem [28]:

min_{D,X} ||S − DX||²_F  subject to  ||x_i||_0 ≤ L for every column x_i of X.    (2)

This problem is also referred to as sparse coding, and its formulation is different from that in [23]. The sparsity constraint in Eq. (2) implies that the decomposed signal sources can spatially overlap, which is a valid assumption for commonly identified BOLD functional networks. Specifically, the mathematical constraint ||x_i||_0 ≤ L implies that each column x_i of X contains at most L nonzero elements; in other words, the matrix X is L-joint sparse. As a direct interpretation of this mathematical constraint, we assume that no more than L components are simultaneously activated at each spatial voxel. This assumption is supported by the findings that a neural response is sparse, i.e., a sparse set of neurons encodes specific concepts rather than responding to every input [17]. In practice, L is smaller than the number of sources K (typically, about two networks are simultaneously active at each spatial voxel, out of eight to ten common functional networks that are anatomically separated [30]).
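The per-voxel sparse coding step in Eq. (2) is typically approximated with a greedy pursuit. The following is a minimal NumPy sketch of orthogonal matching pursuit on a toy dictionary; it is illustrative only and not the exact pursuit implementation referenced in the paper:

```python
import numpy as np

def omp(D, s, L):
    """Greedy orthogonal matching pursuit: approximate s with at most
    L atoms (columns) of D. Illustrative sketch of the sparse coding
    step, not the exact pursuit implementation used in the paper."""
    M, K = D.shape
    residual = s.copy()
    support = []
    x = np.zeros(K)
    for _ in range(L):
        # pick the atom most correlated with the current residual
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # least-squares refit on the selected support
        coef, *_ = np.linalg.lstsq(D[:, support], s, rcond=None)
        residual = s - D[:, support] @ coef
    x[support] = coef
    return x

# toy example: 2 of 5 atoms active at one "voxel" (L = 2)
rng = np.random.default_rng(0)
D = rng.standard_normal((100, 5))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
x_true = np.zeros(5)
x_true[[1, 3]] = [2.0, -1.5]
s = D @ x_true
x_hat = omp(D, s, L=2)
print(np.nonzero(x_hat)[0])             # recovered support -> [1 3]
```

In the full algorithm this pursuit is run once per voxel (per column of S) with the current dictionary estimate held fixed.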
The optimization problem (2) can be related to the Morphological Component Analysis (MCA) used to solve the blind source separation problem in image inpainting, except that here the d_k are the atoms constituting the dictionary D rather than the dictionaries themselves [26]. The non-smooth, non-convex optimization problem (2) is solved using the K-SVD algorithm, which iterates between two stages [28]. In the sparse coding stage, the spatial matrix X is computed using any pursuit method [31]-[33]. In the dictionary update stage, each dictionary element or atom d_k is updated sequentially by computing a residual matrix E_k, restricting it to the columns that actually use d_k in their representation, and finding a rank-one approximation of the restricted residual using the SVD [18]. Once the matrices D and X are estimated, the spatial components reflecting functional connectivity are the rows of X.
Before applying the K-SVD algorithm, we propose to pre-whiten the fMRI data and initialize the dictionary with eigenvectors of the covariance matrix SS^T, combined with random initialization. The selection of the number of components K and the sparsity parameter L used in the MCA-KSVD algorithm can be guided by the rank R of the noiseless data matrix S_0. Specifically, the number of components K is suggested to be equal to the rank R, while a reasonable approximate choice for the sparsity parameter L is (K + 1)/2, as found in our studies. The rank R can be determined using results from random matrix theory, as described in Section III-C.

B. Proposed algorithm
Given the measured data s(r, t), expressed in matrix form as in Eq. (1), the proposed iterative, greedy algorithm performs the spatial-temporal decomposition of the data as follows [28]:
1) Initialization: Set the number of components K and the sparsity parameter L. Pre-whiten the fMRI data using the eigenvalue decomposition of the data covariance matrix SS^T. Initialize the dictionary D as the first K left singular vectors of the data matrix, plus some random initialization.
2) Repeat until convergence:
• Sparse coding: Given the fixed estimate of the dictionary D, compute the matrix X using any pursuit algorithm (such as matching pursuit or basis pursuit) by approximating, for each data column s_i, the solution of min_{x_i} ||s_i − D x_i||²_2 subject to ||x_i||_0 ≤ L.
• Dictionary update: Given the estimated X, update each column d_k of the dictionary D as follows:
- Define the group of indices ω_k pointing to the data columns s_i that use the atom d_k.
- Compute the overall decomposition error matrix E_k = S − Σ_{j≠k} d_j x^(j).
- Restrict E_k by choosing only the columns corresponding to ω_k, obtaining E_k^R.
- Apply the SVD decomposition E_k^R = UΔV^T and choose the updated atom d_k to be the first column of U.
The sparse coding and dictionary update steps of the algorithm are alternated until convergence is observed. The decomposed temporal and spatial distributions of the functional networks are then extracted from the estimated D and X, respectively.
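The dictionary-update sweep above can be sketched in a few lines of NumPy. This is an illustrative implementation of the rank-one residual update, with toy data shapes chosen for the example; it is not the authors' code:

```python
import numpy as np

def ksvd_dictionary_update(S, D, X):
    """One K-SVD dictionary-update sweep (sketch): each atom d_k and its
    coefficient row x^(k) are refitted via a rank-one SVD of the residual
    restricted to the columns (voxels) that actually use the atom."""
    for k in range(D.shape[1]):
        omega = np.nonzero(X[k, :])[0]           # voxels using atom k
        if omega.size == 0:
            continue                             # unused atom: leave as-is
        # residual with atom k's own contribution removed
        E_k = S - D @ X + np.outer(D[:, k], X[k, :])
        E_Rk = E_k[:, omega]
        U, sigma, Vt = np.linalg.svd(E_Rk, full_matrices=False)
        D[:, k] = U[:, 0]                        # updated (unit-norm) atom
        X[k, omega] = sigma[0] * Vt[0, :]        # updated coefficients
    return D, X

# toy check: one update sweep never increases the fit error
rng = np.random.default_rng(0)
D0 = rng.standard_normal((20, 4))
D0 /= np.linalg.norm(D0, axis=0)
X0 = np.zeros((4, 30))
for i in range(30):
    X0[rng.choice(4, size=2, replace=False), i] = rng.standard_normal(2)
S = D0 @ X0 + 0.01 * rng.standard_normal((20, 30))
err_before = np.linalg.norm(S - D0 @ X0)
D1, X1 = ksvd_dictionary_update(S, D0.copy(), X0.copy())
err_after = np.linalg.norm(S - D1 @ X1)
print(err_after <= err_before)  # True
```

Because each atom update is the optimal rank-one fit to its restricted residual, the fit error is non-increasing across the sweep; alternating this sweep with the sparse coding step gives the full iteration.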

III. ALGORITHM CONSIDERATIONS
A. Improved performance of MCA-KSVD

This section presents a set of simulation studies evaluating the proposed MCA-KSVD method against the conventional ICA method in terms of decomposition accuracy. We consider a case of two source components, with the corresponding time series extracted from experimental data and simulated spatial distributions. The level of spatial independency between two components can be characterized by the independency index [15]. Specifically, two sources are statistically independent if their joint probability density function can be decomposed as the product of the individual probability density functions,

p(x_1, x_2) = p(x_1) p(x_2).    (3)

By approximating the probability of activation of each source as

p(x_i) ≈ #V_i / #V,    (4)

where #V_i is the number of active voxels of the i-th source component and #V is the total number of voxels, the independency index is defined as the total deviation of the joint activation distribution from the product of the marginals,

Ind = 4 | #V_12 / #V − (#V_1 · #V_2) / (#V)² |,    (5)

where #V_12 is the number of voxels in the joint set V_12 = V_1 ∩ V_2. It then follows that two source processes are independent if

Ind = 0.    (6)

Two components are said to be well-separated if #V_12 is tiny with respect to #V. In this context, the sparsity constraint of the proposed MCA-KSVD method, which allows at most L components to be active at each voxel, directly limits the extent of the overlap set V_12. Source components are said to be sparse if #V_i / #V ≪ 1. Note that this condition concerns the sparsity of the spatial distribution of the source component itself, rather than the sparsity of the overlap region between components that is used in the MCA-KSVD method. Condition (6) in fact holds approximately whenever the sources are sparse and well-separated, as mentioned in [15]. This coincides with the observation that the ICA decomposition is known to be particularly effective when the brain patterns one seeks are spatially sparse, with negligible or no overlap [4].
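An overlap-based index of this kind is straightforward to compute from binary activation maps. The sketch below uses a total-variation deviation between the joint and product-of-marginals activation distributions as the index; this normalization is one plausible choice and may differ from the exact definition in [15]:

```python
import numpy as np

def independency_index(mask1, mask2):
    """Deviation of the joint activation distribution from the product of
    marginals, computed from binary activation maps. One plausible
    realization of the overlap-based independency index; the exact
    normalization used in [15] may differ."""
    v = mask1.size
    p1 = mask1.sum() / v             # #V1 / #V
    p2 = mask2.sum() / v             # #V2 / #V
    p12 = (mask1 & mask2).sum() / v  # #V12 / #V
    # summing |p_joint - p1*p2| over all four on/off patterns gives 4|...|
    return 4.0 * abs(p12 - p1 * p2)

# sparse, well-separated sources -> index near zero (nearly independent)
m1 = np.zeros(1000, dtype=bool); m1[:50] = True
m2 = np.zeros(1000, dtype=bool); m2[500:550] = True
print(independency_index(m1, m2))   # ≈ 0.01
```

Note how the result matches the analysis above: with no overlap the index reduces to 4·(#V_1/#V)(#V_2/#V), which vanishes when both sources are sparse.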
We consider a first simulation case in which the spatial distributions of the sources are simulated with rectangular profiles, each having sparse localized spatial activations, with some minor overlap between the sources. The independency index in this case is 0.52, indicating that the spatial independency condition of the ICA method does not hold. Temporal series were extracted from those of the DMN and auditory networks obtained by performing the ICA decomposition on the BOLD task data; the components were then mixed according to Eq. (1) and a moderate level of Gaussian noise was added. The ICA and MCA-KSVD methods were then applied to the resulting data set, with the number of components K set to 2 for both methods and the sparsity parameter L of the MCA-KSVD method equal to 2. Figure 1(a) shows the resulting decompositions obtained from the ICA and MCA-KSVD methods. One can see that the first component decomposed by the ICA method experiences signal loss in the region of overlap between the two components, as indicated by the arrows and corresponding error maps. The error maps were calculated as the difference between the decomposed component and the ground truth. In contrast, the MCA-KSVD method yields only a slight amount of signal loss, as seen from the error maps between the reconstructed spatial distributions and the ground truth. The temporal series of the first component reconstructed by both methods closely resembled the ground truth, but the temporal series of the second component was estimated more accurately by the MCA-KSVD method.
In the second case, the spatial distributions of the two source components were simulated by extracting spatial distributions of the commonly observable fMRI visual networks at two adjacent slices of the same BOLD task functional data used in the first case. The resulting spatial distributions of the two components are quite similar and significantly overlapping, leading to the failure of the ICA independency condition. Figure 1(b) shows that the ICA method failed to accurately decompose the two components spatially, with a considerable amount of signal artifact in the region of overlap between the two source components, while the MCA-KSVD method separated the two components quite accurately. The temporal series of both components were estimated more accurately by the MCA-KSVD method than by the ICA method.

B. Signal localization effect of MCA-KSVD
The choice of the sparsity parameter L in Eq. (2) affects the signal localization effect observed with the MCA-KSVD method. Specifically, the algorithm suppresses weak activations in the extracted networks, leading to the visual effect of localized spatial distributions of the networks, as observed in other studies [17], [20], [34]. This localization effect becomes stronger as L is decreased and is most prominent when L = 1. To further investigate the effect of the sparsity parameter on the resulting decomposition, we consider BOLD task functional data acquired at a field strength of 3 T with TR = 2 s, TE = 30 ms, 255 time frames, FOV = 22 cm, and 8 coils, obtained using auditory stimulation with a spiral acquisition. Spatial ICA was performed with 20 components, and four common networks were extracted: Visual 1, Visual 2, Auditory, and DMN. These spatial distributions and their corresponding temporal series were then mixed according to Eq. (1), and a significant amount of additive Gaussian noise was added, yielding a data SNR of −2.74 dB. The resulting data were passed to the spatial ICA and MCA-KSVD decompositions with sparsity parameter L = 1, 2. Figure 2 shows the decomposed networks with the corresponding reconstructed temporal series. Both the MCA-KSVD and ICA methods recovered the typical spatial distribution patterns of the fMRI networks well, except for the auditory network reconstructed by the ICA method, which showed a false increase of the overall signal energy. The MCA-KSVD method suppressed weak signal activations, as expected from the signal localization effect of the sparsity constraint employed in the method. The temporal series of the MCA-KSVD method in this simulation study consistently matched the ground truth better than those of the spatial ICA method.
The pair-wise correlation matrix of the networks is shown in Fig. 2(c). The correlation matrix obtained from the MCA-KSVD method with L = 2 demonstrates that, in contrast to the ICA method, the MCA-KSVD decomposition does not require its components to be uncorrelated. Hence, the MCA-KSVD sparsity constraint is a more relaxed assumption than the independency constraint of the ICA method. As the sparsity parameter decreases, and in the extreme case of L = 1, the correlation matrix of the MCA-KSVD method was found to be even closer to uncorrelated than that of the ICA method. This supports the observation in [15] that a sparse and well-separated component is always nearly independent.
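Checking whether decomposed components are mutually uncorrelated reduces to computing the pair-wise correlation matrix of their spatial maps (the rows of X). A minimal sketch, with toy one-dimensional maps standing in for decomposed spatial components:

```python
import numpy as np

def component_correlations(X):
    """Pair-wise Pearson correlations between the rows of X
    (each row = spatial map of one decomposed component)."""
    return np.corrcoef(X)

# two overlapping (hence correlated) toy spatial maps:
# overlap is allowed by the sparsity constraint, so off-diagonal
# entries of the correlation matrix need not vanish
x1 = np.zeros(100); x1[20:60] = 1.0
x2 = np.zeros(100); x2[40:80] = 1.0
C = component_correlations(np.vstack([x1, x2]))
print(round(C[0, 1], 2))   # -> 0.17, nonzero due to the overlap
```

For an ICA decomposition the analogous off-diagonal entries are driven toward zero by construction, which is the contrast Fig. 2(c) illustrates.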

C. Choice of parameters
A proper selection of the number of components K to be decomposed affects the performance of both the MCA-KSVD and ICA methods, as with any other decomposition method. Too small a K results in signal leakage from one network into another, while too large a K results in a signal splitting effect, i.e., one functional network is split into several components. In addition to the number of components, the sparsity parameter L defined in Eq. (2) plays an important role in the performance of the MCA-KSVD method. As pointed out earlier, this parameter controls the localization effect of the MCA-KSVD. Interestingly, the selection of both parameters, the number of components K and the sparsity parameter L, can be guided by the rank R of the noiseless data matrix S_0. In fact, it has been pointed out in [35] that the data matrix rank plays a crucial role in ensuring the uniqueness of the L-joint sparse solution of Eq. (2). Specifically, we notice from Eq. (1) that

rank(S_0) ≤ rank(X).    (7)
In practice, for the fMRI network decomposition problem specifically, the number of components K is much less than the number of voxels N, so rank(X) = K. It then follows that

rank(S_0) ≤ K.    (8)

Therefore, a reasonable choice of K can be made by matching it to the rank R of the noiseless data matrix S_0. When the data matrix is contaminated by additive Gaussian noise, the N × M matrix S is of full rank. In this case, the rank R can be determined using results from random matrix theory. Specifically, we first estimate the complex noise standard deviation σ_0 from the measured data and use σ_0 to estimate the spectral norm of the noise matrix ||E||_2, based on the Marchenko-Pastur distribution of the eigenvalues of E^H E [36], as

||Ê||_2 = σ_0 √M (1 + √β),    (9)

with β = N/M. The rank R̂ is then chosen so that

R̂ = max{ l : λ_l > ||Ê||_2 },    (10)

where λ_l are the singular values of S.
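The rank-selection rule of Eqs. (9) and (10) can be sketched as follows, assuming the noise standard deviation σ_0 has already been estimated from the data; the toy dimensions and signal strength are illustrative:

```python
import numpy as np

def estimate_rank(S, sigma0):
    """Effective rank via the Marchenko-Pastur edge: count singular values
    of S exceeding the predicted spectral norm of the noise matrix.
    Sketch of the procedure described in the text; sigma0 is the
    (pre-estimated) noise standard deviation."""
    n, m = S.shape
    beta = n / m
    noise_norm = sigma0 * np.sqrt(m) * (1.0 + np.sqrt(beta))  # Eq. (9)
    svals = np.linalg.svd(S, compute_uv=False)
    return int(np.sum(svals > noise_norm)), noise_norm        # Eq. (10)

# toy check: rank-4 signal plus unit-variance Gaussian noise
rng = np.random.default_rng(1)
n, m, r = 200, 150, 4
S0 = 2.0 * (rng.standard_normal((n, r)) @ rng.standard_normal((r, m)))
sigma0 = 1.0
S = S0 + sigma0 * rng.standard_normal((n, m))
rank_hat, _ = estimate_rank(S, sigma0)
print(rank_hat)
```

Because the largest noise singular value concentrates just below the Marchenko-Pastur edge, signal singular values well above it are counted reliably, while values sitting exactly at the edge can occasionally tip the estimate by one.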
To evaluate the performance of the described rank selection method, we applied it to a simulation dataset with the rank of the noiseless data matrix R = 4. A Monte-Carlo study was performed at 5 different noise levels, with 64 noise realizations per noise level. The spectral norm ||E||_2 and the effective rank were determined according to Eqs. (9, 10). Table I shows the mean effective rank R̂, averaged over all noise realizations. Notice that the norm ||E||_2 was closely estimated at every SNR level, as seen from the reported relative error (||E||_2 − ||Ê||_2) / ||E||_2. Consequently, the rank R̂ (and hence the number of components K) was also estimated close to the true value. The proposed rank selection procedure achieved high accuracy even in severe noise cases, as the SNR of the measured data significantly dropped.
As for a reasonable choice of the sparsity parameter L, it is well known that a necessary and sufficient condition for the measurements s = Dx to uniquely determine each L-sparse vector x is [35]

L < spark(D) / 2,

where the spark of the dictionary matrix D is defined as the smallest number of columns of D that are linearly dependent. In the context of the fMRI network decomposition problem, the acquired fMRI time series (the columns of D) are typically linearly independent, so the spark of the matrix is the number of columns plus one; therefore,

L < (K + 1) / 2.

We suggest initially estimating L as (K + 1)/2, then reducing L until the signal leakage effect is no longer observed in the extracted components. In the extreme case of underestimating L, one will notice a severe signal suppression effect. For the simulation study shown in Fig. 2, the proposed selection approach resulted in the optimal value of L = 2.
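The initialization rule for L reduces to a one-liner. A small sketch (the tie-in to the Fig. 2 simulation assumes K = 4 ground-truth networks, which is our reading of that setup):

```python
def initial_sparsity(K):
    """Initial sparsity parameter from the spark bound L < (K + 1) / 2,
    assuming the K time-series atoms are linearly independent
    (spark(D) = K + 1). Reduce further if signal leakage persists."""
    return (K + 1) // 2

print(initial_sparsity(14))  # 7: starting point for the K = 14 task run
print(initial_sparsity(4))   # 2: consistent with L = 2 in Fig. 2 (K = 4 assumed)
```

From this starting point, L is decreased until leakage disappears, stopping before the severe signal-suppression regime of very small L.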

IV. RESULTS AND DISCUSSIONS
A. Experimental data

MCA-KSVD analysis was applied to both task and resting-state fMRI on human volunteers. BOLD task functional data (31 slices, 4 mm slice thickness, TR = 2 s, TE = 30 ms, 255 time frames, FOV = 22 cm, 8 coils) were obtained using auditory stimulation with a spiral acquisition. Task fMRI time series were correlated with the task paradigm convolved with the hemodynamic response function, and thresholded at r = 0.2 to obtain the task activation map shown in Fig. 3.

Fig. 3. Task fMRI experiment: spatial support of task activation obtained by correlating fMRI time series with the task paradigm convolved with the canonical HRF (top row), and spatial support of active voxels from the corresponding ICA and MCA-KSVD components (middle and last rows).

Spatial ICA and MCA-KSVD were then performed with K = 14 components. The MCA-KSVD decomposition was applied with sparsity parameter L = 5, and 100 iterations were conducted to ensure convergence of the algorithm. Both the ICA and MCA-KSVD components were thresholded so that the number of displayed voxels was the same as that of the task activation map. It can be seen from Fig. 3 that the MCA-KSVD method is able to identify task-related network structures. Figure 4 shows the spatial distributions of functional connectivity networks (visual 1, visual 2, auditory, and DMN) revealed by the ICA and MCA-KSVD methods. The networks were overlaid on top of anatomical images for more convenient pattern analysis. The networks identified by MCA-KSVD show consistent spatial connectivity patterns, resembling those of the ICA. In the case of the visual 1 and visual 2 networks (first and second rows), false activations in the region of moving eyes (indicated by arrows) were suppressed by MCA-KSVD, as compared to those recovered by ICA.
Resting-state fMRI was then acquired with TR = 2 s, TE = 30 ms, 240 time frames. Figure 5 shows the resulting ICA and MCA-KSVD spatial components when the number of components to be decomposed was set to K = 20 and the sparsity parameter to L = 8. Resting-state networks such as the visual, sensorimotor, DMN1, DMN2, and Executive Control Network (ECN) are revealed by both methods. Thus, we conclude that MCA-KSVD is able to resolve resting-state networks. To assess the spatial resolution of the networks decomposed by the ICA and MCA-KSVD methods, we calculated the 2-D autocorrelation function of the spatial distribution of each decomposed network. Figure 5(b) shows 1-D plots of the autocorrelation functions extracted in the middle of the horizontal direction and averaged over slices. One can see that the autocorrelation function of the visual network reconstructed by the ICA method has a larger main-lobe width than that reconstructed by the MCA-KSVD method. This implies that the spatial resolution of the MCA-KSVD components is higher than that of the ICA in the case of the strong visual network. For the other components, the spatial resolution yielded by both methods was about the same. As the sparsity parameter decreases, the width of the main lobe of the autocorrelation function decreases accordingly, due to a stronger signal localization effect. Figure 6 shows the MCA-KSVD versus ICA decompositions on the second resting-state fMRI data set, acquired on a Siemens 3 T Prisma using a vendor-supplied 32-channel coil, with TR = 1.03 s, TE = 31 ms, 42 slices, 475 time frames. Parameters for the methods were chosen to be similar to those of the previous experiment, K = 20 and L = 8. We note that resting-state networks such as the visual, auditory, DMN, and ECN networks are again revealed by both methods. The spatial patterns of the MCA-KSVD components closely resemble those of the ICA, except for an additional signal localization effect due to the sparsity constraint employed by the method.
This effect suppresses weak signals while preserving strong activations in the decomposed networks. As an example, the auditory network decomposed by the MCA-KSVD method had signal activations in the expected region of the auditory cortex, while activations in other regions, such as the region of cerebrospinal fluid (indicated by arrows), were suppressed as compared to those decomposed by the conventional ICA. The results obtained in this experiment are consistent with the results of the previous experiments and demonstrate that the sparsity-based MCA-KSVD method marginally improves the fMRI network connectivity decomposition and exhibits a distinctive signal localization effect due to the sparsity constraint employed in the method. This effect suppresses weak signals (such as false signal activations in the region of moving eyes or CSF), in contrast to the conventional ICA method, and is advantageous in the case of noisy fMRI data. The autocorrelation functions shown in Fig. 6(b) confirm again that the spatial resolution of the components resolved by the MCA-KSVD method is overall slightly higher than that of the ICA method. Specifically, in the case of the visual network the spatial resolution was higher by a factor of about 1.4.

B. Denoising performance
To analyze the denoising performance of the proposed MCA-KSVD method and compare it with the conventional spatial ICA method, we revisit Eq. (1), where the measured data matrix is modelled as S = S_0 + E, and denote the denoised data S̃ = S_0 + Δ, where S_0 is the data matrix in the noiseless case, E is the noise matrix, and Δ is the residual noise, which can contain both noise and possible signal loss caused by the decomposition of the functional networks. To characterize the noise level, we define the SNR of the measured data as SNR = 10 log_10 (P_s / ||E||²_F), where P_s = ||S_0||²_F is the signal power and ||E||_F is the Frobenius norm of the noise matrix. To evaluate the ability of the described MCA-KSVD method to handle noise and compare it with the performance of the conventional ICA method, we performed both decompositions on a simulated dataset with ground-truth Visual 1, Visual 2, Auditory, and DMN networks extracted from the experimental dataset described in Section III-B. BOLD task functional data (31 slices, 4 mm slice thickness, TR = 2 s, TE = 30 ms, 255 time frames, FOV = 22 cm, 8 coils) were obtained using auditory stimulation with a spiral acquisition. The spatial distributions and time series of the ground-truth networks were obtained by performing the spatial ICA method. Gaussian noise at different noise levels was then added to the resulting simulated data. In order to obtain better performance from both methods in the case of severe noise, the number of components to be decomposed was slightly overestimated to K = 6, and only the 4 functional networks whose spatial patterns could be identified were retained. For the MCA-KSVD method, the sparsity parameter was chosen to be L = 2.
To quantitatively estimate the denoising performance of both methods, we define the noise reduction factor as

g = ||E||_F / ||Δ||_F,

computed in the background region outside the field of view of the brain, in order to ensure that Δ contains only residual noise. Ideally, one would want g = ∞; in the case of no denoising, g = 1. Table II shows the noise reduction factor obtained from both the ICA and MCA-KSVD methods at four different noise levels. In all cases, the noise reduction factor of the MCA-KSVD method is on average approximately 1.4 times greater than that of the ICA method, implying that the proposed MCA-KSVD method has a better denoising capability than the conventional ICA method. The better denoising performance of the MCA-KSVD method is due to the suppression of weak signals imposed by the sparsity constraint.
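The noise reduction factor and the SNR definition above can be sketched as follows. The toy matrices are illustrative (in practice E and Δ are taken from the background region), and P_s = ||S_0||²_F is our reading of the signal-power definition:

```python
import numpy as np

def noise_reduction_factor(E, Delta):
    """g = ||E||_F / ||Delta||_F over a background region: original noise
    energy relative to the residual noise after decomposition-based
    denoising (g = 1 means no denoising was achieved)."""
    return np.linalg.norm(E) / np.linalg.norm(Delta)

def snr_db(S0, E):
    """SNR of the measured data, assuming P_s = ||S0||_F^2."""
    return 10.0 * np.log10(np.linalg.norm(S0) ** 2 / np.linalg.norm(E) ** 2)

# toy check: residual noise amplitude halved -> g ≈ 2
rng = np.random.default_rng(2)
E = rng.standard_normal((64, 64))
Delta = 0.5 * rng.standard_normal((64, 64))
print(round(noise_reduction_factor(E, Delta), 1))   # ≈ 2.0
```

A fixed-amplitude scaling of the residual noise maps directly onto g, which is what makes the factor easy to compare across the SNR levels of Table II.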

V. CONCLUSIONS
This paper has presented a data-driven method, MCA-KSVD, exploiting sparse representations for functional connectivity detection in both task and resting-state fMRI. The method decomposes fMRI data into time series and spatial components which have sparse spatial overlap, unlike the conventional ICA analysis. A detailed comparison between the proposed sparsity constraint and the statistical independency constraint of the conventional ICA method has been presented on both simulated and experimental task and resting-state fMRI data. We found that the independency constraint of the ICA method holds when the sources are sparse and well-separated. In other cases, comprising spatial overlap between the functional networks, we demonstrated that the proposed MCA-KSVD method decomposes the networks more accurately than ICA. Experimental results show that the MCA-KSVD method can identify functional networks in both task and resting-state fMRI resembling those of the ICA, while not requiring the decomposed components to be uncorrelated. In addition, the sparsity constraint exhibits a distinctive signal localization effect, suppressing weak signal activations while preserving strong signals. Due to this effect, the MCA-KSVD method yields a better denoising performance than the conventional ICA, in terms of preserving the spatial-temporal characteristics of the decomposed functional networks while achieving a higher noise reduction factor over a wide range of SNRs. We conclude that the MCA-KSVD method should prove useful as an alternative method for investigating functional connectivity of the brain, especially in the presence of noise.