Transcranial Phase Correction Using Pulse-Echo Ultrasound and Deep Learning: A 2-D Numerical Study

Phase aberration caused by human skulls severely degrades the quality of transcranial ultrasound images, posing a major challenge in the practical application of transcranial ultrasound techniques in adults. Aberration can be corrected if the skull profile (i.e., thickness distribution) and speed of sound (SOS) are known. However, accurately estimating the skull profile and SOS using ultrasound with a physics-based approach is challenging due to the complexity of the interaction between ultrasound and the skull. A deep learning approach is proposed herein to estimate the skull profile and SOS using ultrasound radiofrequency (RF) signals backscattered from the skull. A numerical study was performed to test the approach’s feasibility. Realistic numerical skull models were constructed from computed tomography (CT) scans of five ex vivo human skulls in this numerical study. Acoustic simulations were performed on 3595 skull segments to generate array-based ultrasound backscattered signals. A deep learning model was developed and trained to estimate skull thickness and SOS from RF channel data. The trained model was shown to be highly accurate. The mean absolute error (MAE) was 0.15 mm (2% error) for thickness estimation and 13 m/s (0.5% error) for SOS estimation. The Pearson correlation coefficient between the estimated and ground-truth values was 0.99 for thickness and 0.95 for SOS. Aberration correction performed using deep-learning-estimated skull thickness and SOS values yielded significantly improved beam focusing (e.g., narrower beams) and transcranial imaging quality (e.g., improved spatial resolution and reduced artifacts) compared with no aberration correction. The results demonstrate the feasibility of the proposed approach for transcranial phase aberration correction.


I. INTRODUCTION
TRANSCRANIAL ultrasound could enable a wide variety of applications, such as brain imaging [1], [2], stroke diagnosis [3], intracerebral hemorrhage detection [4], brain perfusion evaluation [5], and endonasal trans-sphenoidal surgery [6]. Compared with other imaging modalities (e.g., computed tomography (CT) and magnetic resonance imaging), ultrasound imaging has the intrinsic advantages of being real-time, affordable, portable, noninvasive, and nonionizing. However, a large speed of sound (SOS) mismatch between the human skull and the brain [7], [8], combined with strong scattering and absorption [9], severely distorts the ultrasound wavefront, leading to highly degraded ultrasound images. In particular, skull phase aberration is a major limiting factor for the transcranial application of ultrasound.
Highlights
• A deep learning method was proposed to estimate the skull profile and SOS using ultrasound echo signals from the skull, based on which a phase aberration correction approach was developed.
• Aberration correction using the deep-learning-estimated skull profile and SOS showed significantly improved quality for transcranial imaging and beam focusing in this numerical study.
• This study demonstrated the feasibility of the new skull aberration correction approach, providing a potentially practical method to improve transcranial ultrasound imaging and therapy.

Many phase correction techniques have been developed for transcranial ultrasound. Phase correction can be accomplished by using experimental reference signals or a known skull profile and SOS. The experimental signal-guided techniques include the representative time-reversal method [10], [11], [12], which was initially presented as a wave-mirroring method by Fink [10]. In this method, the transcranial temporal pressure waveform is stored, reversed, and re-emitted, providing both phase and amplitude correction. However, this particular time-reversal method is not practical because it requires a sensor inside the brain. Vignon et al. [13] made further improvements by placing two identical linear arrays, one on each side of a skull, and using a spatiotemporal inverse filter to estimate the phase aberration. Some experimental signal-guided methods [14], [15], [16], [17], [18] used a near-field phase-screen model coupled with multilag, least-squares, and correlation-based phase error algorithms. These methods assume a negligibly thin aberrator, which serves as an approximation for more complex aberration profiles such as a skull. Thus, the application of these methods is mainly limited to the temporal bone, where the skull is thin (mean thickness reported to be 3-4 mm). In addition, Clement et al. [19] induced a shear mode for brain imaging, as the shear wave speed in the skull is close to the longitudinal sound speed of brain tissues and provides less refraction, but its applicability has been restricted by the extremely high attenuation of shear waves in the skull [20].
Approaches modeling refraction based on the accurate geometry and SOS of the skull have been proposed. Some studies [21], [22] have extracted skull information from CT images based on the empirical correlation between Hounsfield units and SOS. Although this CT-based approach has demonstrated success in brain therapy, it is less suitable for ultrasound brain imaging because of the added ionizing radiation associated with additional CT scans, which contradicts the goal of ultrasound imaging. There is a need to develop a robust approach to extract the skull profile and SOS directly using ultrasound, preferably with the same imaging probe. Wydra et al. [23] introduced the variable focus technique and measured the thickness and SOS by searching for the maximum reflected amplitude. This method requires clearly distinct reflected echoes from the near and far surfaces of the skull. Mozaffarzadeh et al. [24] used a single probe and the bidirectional headwave technique to estimate the compressional wave speed in the skull, focusing specifically on the human temporal window. Existing ultrasound-based methods for extracting the skull profile and SOS explicitly model the interactions between ultrasound and the skull. While promising, these methods often require simplifying assumptions, such as the skull being a homogeneous aberrator or distinct reflected echoes from the skull surfaces being available. These assumptions may limit the ultimate capability of methods based on explicit modeling of the physics. The interactions between ultrasound and the skull are complex, and explicitly modeling the physical process is challenging. Data-driven approaches, such as deep learning, may be a promising alternative that avoids explicit modeling of complex physics while achieving accurate results [25].
A deep learning method is proposed herein to estimate the skull profile and SOS with ultrasound radiofrequency (RF) signals backscattered from the skull. Various studies have shown that deep learning can effectively extract tissue properties from raw RF data (see [26], [27]). In this article, a deep learning model is developed to estimate skull thickness and average SOS along each scan line. The resulting skull thickness and SOS values will allow the reconstruction of the skull profile and the lateral distribution of the SOS that can subsequently be used for phase aberration correction using a fast marching method (FMM) [28].
This numerical study aims to test the feasibility of our deep-learning-based phase correction approach for pulse-echo ultrasound. This article is organized as follows. Section II describes our proposed approach and the methodology used for this numerical study. In Section III, phase correction results for focusing and B-mode imaging are presented. Sections IV and V present the discussion and conclusion of this article, respectively.

II. METHODOLOGY
Our approach consists of the following steps:
1) Skull Outer Surface Detection: A standard synthetic aperture scan [29] is performed on the skull using an array transducer to detect the skull's outer surface.
2) Skull Thickness and SOS Scan: Without changing the transducer location, a second ultrasound scan is performed to acquire RF signals backscattered from the skull for skull thickness and SOS estimation. The ultrasound beam is focused inside the skull for this scan. The acquired RF data are then used by a deep learning model to estimate the skull thickness and SOS. The deep learning model needs to be trained a priori.
3) Skull SOS Map Reconstruction: Combining the results from steps 1) and 2), a skull SOS map is reconstructed to represent the skull's inner and outer surfaces and the lateral distribution of skull SOS. Our phase aberration correction method does not require the axial distribution of skull SOS.

4) Time Delay Computation: The FMM is applied to the skull SOS map reconstructed in step 3) to compute the time delay to be applied to each transducer element for aberration correction. The time delay information will subsequently be used to deliver focused acoustic energy to the brain for therapy or to perform aberration-corrected transcranial imaging of the brain.
All ultrasound data acquisitions involved in the above steps can be performed using the same array transducer scanning the same skull location, allowing ease of use and potential real-time implementation of this approach.
The feasibility of this approach is evaluated herein through a numerical study, where acoustic simulations are performed on realistic skull models derived from the CT scans of five ex vivo skulls. The simulations were performed using the k-Wave toolbox [30], which has been shown to be accurate and efficient for transcranial ultrasound simulations [31]. Shear waves were not considered because the ultrasound beam was mostly perpendicular to the skull surface in our simulations and shear waves are strongly attenuated in the skull in the frequency range considered. It has been demonstrated that skull phase errors due to shear waves can have a negligible effect on focusing for angles of incidence up to 20° [32]. Two-dimensional simulations were performed for computational efficiency. The performance of the proposed phase aberration correction approach was evaluated in terms of the accuracy of the skull thickness and SOS estimates and the effectiveness of phase aberration correction in transcranial focusing and imaging.

A. Numerical Skull Models
Numerical skull models were derived from archived diagnostic CT scan data of five ex vivo human skulls. CT scans were conducted using a Siemens SOMATOM CT scanner with a bone reconstruction kernel (AH82) to obtain image intensities directly proportional to bone density. The acquired CT images featured a slice thickness of 1 mm, a spatial resolution of 0.39 mm/pixel, and a size of 200 × 200 mm [Fig. 1(a)] [33]. The CT images were processed to segment the skull from the background [Fig. 1(b)]. Density and SOS maps of the skull were created from the segmented CT images by assuming linear relationships [34], [35] between the CT Hounsfield unit values and the SOS and density values. The density and SOS of the background were set to 1000 kg/m³ and 1500 m/s, respectively.
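The HU-to-property conversion described above can be sketched as follows. The slopes, the maximum-HU normalization, and the bone endpoint values below are illustrative placeholders, not the calibrated relationships of [34], [35]:

```python
import numpy as np

# Illustrative linear HU -> property mapping. All coefficients below are
# placeholder assumptions for this sketch, not the calibrated values.
RHO_WATER, RHO_BONE = 1000.0, 2200.0   # kg/m^3
C_WATER,   C_BONE   = 1500.0, 3000.0   # m/s
HU_MAX = 2000.0                        # HU assumed to map to the densest bone

def hu_to_maps(hu, skull_mask):
    """Map HU values linearly to density and SOS; background pixels get
    water-like values (1000 kg/m^3, 1500 m/s) as in the text."""
    frac = np.clip(hu / HU_MAX, 0.0, 1.0)
    rho = np.where(skull_mask, RHO_WATER + frac * (RHO_BONE - RHO_WATER), RHO_WATER)
    sos = np.where(skull_mask, C_WATER + frac * (C_BONE - C_WATER), C_WATER)
    return rho, sos
```

The segmented skull mask selects which pixels receive the linear mapping; everything else stays at the background values.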
Simulations were performed on skull segments rather than the entire skull to mimic the conditions of real applications, where the ultrasound transducer has a finite footprint covering a small region of the skull, as well as to minimize computation. The skull density and SOS maps were segmented into 40-mm-long segments [Fig. 1(c)], with a 20-mm overlap between adjacent segments. The transducer was placed at a fixed location in the simulation, with the transducer surface parallel to the horizontal direction. Each skull segment was rotated to make its outer surface as parallel to the horizontal transducer as possible [Fig. 1(d)]. We obtained 3595 2-D segments from the five ex vivo human skulls.
The spatial resolution of the skull density and SOS maps was increased from 0.39 to 0.10 mm/pixel via image interpolation for more accurate acoustic simulations at the frequency of interest. Nearest-neighbor interpolation was used to preserve the discontinuity of the density and SOS values at the edges of the skull. Acoustic simulations were performed on each segment after interpolation.
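A minimal pure-NumPy sketch of the nearest-neighbor upsampling (the 0.39 to 0.10 mm/pixel factor is from the text; the toy map values are illustrative). Because no intermediate values are interpolated, the sharp bone/background discontinuity is preserved:

```python
import numpy as np

def nn_upsample(img, factor):
    """Nearest-neighbor 2-D upsampling: each output pixel copies the nearest
    input pixel, so no new intermediate SOS/density values appear at edges."""
    rows = np.minimum((np.arange(int(img.shape[0] * factor)) / factor).astype(int),
                      img.shape[0] - 1)
    cols = np.minimum((np.arange(int(img.shape[1] * factor)) / factor).astype(int),
                      img.shape[1] - 1)
    return img[np.ix_(rows, cols)]

# Toy coarse map: water background (1500 m/s) with one bone pixel (2800 m/s).
coarse = np.array([[1500.0, 1500.0],
                   [1500.0, 2800.0]])
fine = nn_upsample(coarse, 0.39 / 0.10)  # 0.39 -> 0.10 mm/pixel
```

Note that the upsampled map contains only values already present in the coarse map, which is exactly why nearest-neighbor (rather than linear) interpolation suits discontinuous property maps.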
The skull attenuation was considered to better model realistic conditions in the simulation. A spatially homogeneous skull attenuation coefficient of 17 dB/cm at 1.5 MHz and a power-law frequency dependence with an exponent of 2 were used in the simulation. Spatially homogeneous attenuation for the skull has been shown to result in good agreement between simulations and experiments [36]. A study [9] measured the attenuation of skull bone at a center frequency of 1 MHz and determined the average attenuation coefficient to be 13.3 ± 0.97 dB/cm. Another study [37] investigated the attenuation coefficient of freshly excised human skulls and found it to be 27 dB/cm at 1.4 MHz for cortical bone.

B. Simulation Settings
A linear array was used to transmit and receive ultrasound waves in all of our k-Wave simulations. The central frequency was 1.5 MHz, and the pitch was 0.8 mm, equivalent to 4/5 of the wavelength at the central frequency. The frequency response of the array was modeled as a Gaussian function with an 80% bandwidth for transmission and reception. More simulation details are included in the Supplementary Materials (Table S1). Noise was considered in our simulations to investigate its effects on the performance of our proposed phase aberration correction approach. Simulations were performed to image point targets through the skull. Two noise levels were considered: 1) zero noise [e.g., Fig. 2(a)] and 2) a noise level that still allowed visualization of point targets through the skull, where the signal-to-noise ratio (SNR) was 10 dB, calculated by assuming the "signals" to be those from the simulated point targets [e.g., Fig. 2(b)].
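The 10-dB noise condition can be reproduced by scaling white Gaussian noise to the clean RF power; a minimal sketch, where the toy tone frequency and sampling rate are illustrative assumptions:

```python
import numpy as np

def add_noise(rf, target_snr_db, rng):
    """Add white Gaussian noise so that the data reach a target SNR, with
    the 'signal' power measured from the clean RF traces."""
    p_sig = np.mean(rf ** 2)
    p_noise = p_sig / (10.0 ** (target_snr_db / 10.0))
    return rf + rng.normal(0.0, np.sqrt(p_noise), rf.shape)

rng = np.random.default_rng(0)
# Toy clean trace: a 1.5-MHz tone sampled at an assumed 40 MHz.
clean = np.sin(2 * np.pi * 1.5e6 * np.arange(100_000) / 40e6)
noisy = add_noise(clean, 10.0, rng)
```

Measuring the SNR of `noisy` against `clean` recovers approximately 10 dB, which mirrors how the noise level in Fig. 2(b) was defined.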

C. Ultrasound Data Acquisitions for Skull Outer Surface Detection and Skull Thickness and SOS Estimation
A standard synthetic aperture approach [29] was applied to detect the outer surface of the skull. In each emission of synthetic aperture imaging, a single element was used to transmit a wave covering the full image region, and all elements received the echoes. After all emissions, the delay-and-sum (DAS) algorithm [38] was used to generate B-mode images [Fig. 3(a)] that allowed us to identify the outer surface of the skull [Fig. 3(b)].
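A minimal sketch of synthetic-aperture DAS beamforming on toy single-scatterer data. The array geometry, sampling rate, and SOS below are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

C, FS = 1500.0, 40e6                  # assumed SOS (m/s) and sampling rate (Hz)
N_EL, N_T = 8, 2048                   # 8-element toy array, 2048 time samples
elems = np.stack([np.arange(N_EL) * 0.8e-3, np.zeros(N_EL)], axis=1)

def dist(a, b):
    return np.hypot(a[0] - b[0], a[1] - b[1])

# Synthetic-aperture data for one point scatterer: each (tx, rx) trace holds
# a unit echo at the round-trip time of flight.
scat = (2.0e-3, 10.0e-3)
rf = np.zeros((N_EL, N_EL, N_T))
for tx in range(N_EL):
    for rx in range(N_EL):
        t = (dist(elems[tx], scat) + dist(elems[rx], scat)) / C
        rf[tx, rx, int(round(t * FS))] = 1.0

def das(pixel):
    """Delay-and-sum: for one image pixel, sum every (tx, rx) trace at its
    geometric round-trip delay; echoes add coherently only at the target."""
    val = 0.0
    for tx in range(N_EL):
        for rx in range(N_EL):
            idx = int(round((dist(elems[tx], pixel) + dist(elems[rx], pixel)) / C * FS))
            if idx < N_T:
                val += rf[tx, rx, idx]
    return val
```

Evaluating `das` over a pixel grid produces the B-mode image; the scatterer pixel sums all 64 traces coherently, while off-target pixels sum incoherently.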
The ultrasound data acquisition process for skull thickness and SOS estimation is described as follows. The simulated array transducer consisted of a total of 50 elements, with 15 active elements simultaneously excited to form a beam focused at 5 mm below the transducer surface (inside the skull), generating a virtual scanline [red dashed line in Fig. 4(a)]. Electronic scanning was performed by continuously shifting the active elements by an index of 1 to move the scanline. For each scanline, the backscattered RF signals at all active elements were recorded, yielding a total of 15 channels of RF signals to be used by a deep learning model to estimate the local skull thickness and SOS for that scanline. The width of the virtual scanline [the narrow black rectangle in Fig. 4(a)] was by definition equal to the pitch of the transducer. However, the ground-truth values of the local thickness and SOS for each scanline were calculated by spatially averaging the thickness and SOS of the entire skull region below the active elements [i.e., the skull region within the two white dashed lines in Fig. 4(a)]. This spatial averaging procedure served to reduce variance, resulting in a homogenized ground-truth SOS map.
In real-world scenarios, the transducer would touch the human scalp (typical thickness, 1 mm [39]) during ultrasound data acquisition. Therefore, the transducer was placed 1 mm above the skull in the simulations to account for the thickness of the scalp. Two linear layers were used before the first convolutional layer to take the physics of the RF channel data into account (Fig. S1 of the Supplementary Materials). The input data were represented by a 30 × 64 matrix, corresponding to 30 channels × 64 data points per channel. Without the linear layers, the convolutional layer would treat each channel equally and ignore the phase information and the correlation between signals received by the array elements. However, the complexity of the scattering of the skull in the near field made it difficult to explicitly find the physical connection between the input channel signals. The two trainable linear layers addressed this issue by linearly transforming the channel data to implicitly model the physical connection between the signals from the 15 array elements. The first linear layer rearranged the 64 data points, and the second linear layer processed the 30 channels. Matrix multiplications in the frequency domain can implement time-domain operations on the signals, such as linear combination, time shift, phase shift, and convolution. The two linear layers served as a preprocessing step before the first convolutional layer and were expected to improve model performance by identifying the optimal linear transformation of the channel data.
Batch normalization (BN) [40] and dropout [41] were used in the second convolutional block. The BN layer was used to reduce internal covariate shift and accelerate the training of the deep neural network. Dropout was applied to prevent the network from overfitting.
For deep learning model training and evaluation, the 3595 skull segments were split into three sets, the training, validation, and test sets, comprising 60%, 20%, and 20% of all skull segments, respectively. To prevent data leakage, skull segments from the same CT image were assigned to the same set.
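The leakage-free split can be sketched as a group-aware partition, where the group label is the source CT image (neighboring segments overlap by 20 mm, so splitting within an image would leak). The helper below is an illustrative implementation, not the authors' code:

```python
import numpy as np

def group_split(segment_ids, group_ids, fracs=(0.6, 0.2, 0.2), seed=0):
    """Split segments ~60/20/20 while keeping all segments that share a
    source CT image (group id) in the same set, preventing leakage of
    overlapping regions across sets."""
    rng = np.random.default_rng(seed)
    groups = rng.permutation(np.unique(group_ids))
    n = len(groups)
    cut1, cut2 = int(fracs[0] * n), int((fracs[0] + fracs[1]) * n)
    sets = [set(groups[:cut1]), set(groups[cut1:cut2]), set(groups[cut2:])]
    return [np.array([s for s, g in zip(segment_ids, group_ids) if g in ss])
            for ss in sets]
```

Because whole groups are assigned to one set, the fractions are approximate when group sizes are unequal, which is the usual trade-off of group-wise splitting.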
To evaluate the skull thickness and SOS estimation performance of the neural network model, the Pearson correlation coefficient [42] and the mean absolute error (MAE) between the ground-truth and estimated values were calculated. The Pearson correlation coefficient describes the strength and direction of the linear relationship between two sets of data and equals unity for perfect estimates. The MAE represents the quantity difference and allocation difference between the ground-truth and estimated values [43].
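Both metrics are straightforward to compute; a minimal sketch:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between ground-truth and estimated values."""
    return np.mean(np.abs(y_true - y_pred))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient; equals unity for perfect (or any
    perfectly linear) relationship between the two sets of values."""
    return np.corrcoef(y_true, y_pred)[0, 1]
```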

E. Reconstruction of Skull SOS Maps
The SOS map was reconstructed for each skull segment to represent the skull shape and the lateral distribution of the SOS. The outer skull surface was determined by the synthetic aperture scan as described in Section II-C. Adding the deep-learning-estimated thickness to the outer surface yielded the inner surface of the skull. The lateral distribution of the SOS was obtained from the deep-learning estimates. The deep learning model yielded 36 local thickness values and 36 SOS values that were used to construct 36 consecutive scanlines for each skull segment, representing the lateral distribution of thickness and SOS. Given the width of the scanline (0.8 mm), the 36 consecutive scanlines spanned a total length of 28.8 mm, which was shorter than the full length of the skull segment. Therefore, the thickness and SOS estimates of the leftmost (rightmost) scanline were extrapolated using the same estimated values to the left (right) edge of the skull segment to cover the full length.
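The edge extrapolation can be sketched with NumPy's edge-padding: 36 scanlines at the 0.8-mm pitch span 28.8 mm, so 7 scanline widths per side are filled with the edge values to cover the 40-mm segment. The placeholder SOS values below are illustrative:

```python
import numpy as np

PITCH = 0.8     # mm, scanline width = transducer pitch
SEG_LEN = 40.0  # mm, skull segment length

# 36 estimated scanline values span 28.8 mm; pad each side with the edge
# value so the lateral SOS profile covers the full 40-mm segment.
sos_est = np.linspace(2600.0, 3000.0, 36)  # placeholder deep-learning estimates
pad = int(round((SEG_LEN - len(sos_est) * PITCH) / (2 * PITCH)))  # 7 per side
sos_full = np.pad(sos_est, pad, mode="edge")  # 50 values x 0.8 mm = 40 mm
```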

F. Phase Aberration Correction
Transcranial focusing and imaging were simulated to assess the performance of phase aberration correction.
1) Focusing and B-Mode Imaging Simulation: Transcranial focusing and B-mode imaging were simulated with the array placed 1 mm above the skulls, as in the simulations acquiring RF signals for SOS and thickness estimation. While the beams were focused at 5 mm for thickness and SOS estimation, the focal depth was 30 mm for transcranial focusing and imaging. For imaging purposes, each focused beam was generated using 32 consecutive active elements (instead of the 15 active elements used in SOS and thickness estimation) to achieve a better lateral resolution. These adjustments resulted in a total of 19 focused beams and an f-number of 1.17. The spatial and temporal resolutions of the simulations matched those in the RF signal acquisition for SOS and thickness estimation. In the imaging process, 12 point targets with a radius of 0.3 mm were placed 20, 25, 30, and 35 mm below the array. The density of the point targets was 4000 kg/m³, and the SOS was 2000 m/s.
2) FMM: The FMM [44] was used to correct the phase aberration caused by skulls. The FMM computes numerical solutions of the eikonal equation, which can be considered an approximation to the wave equation:

|∇T(r)| = 1/c(r),

where T is the arrival time of the acoustic wave at point r, and c is the spatially varying SOS. The arrival times computed from the eikonal equation were directly linked to the desired time delay of each transducer element. We used the FMM with the reconstructed SOS maps to calculate accurate transmit time delays to correct the phase aberration for better focusing. Point targets were imaged through the skull with the corrected beam. The FMM was then applied to calculate accurate receive time delays in the DAS algorithm during B-mode image construction, which yielded phase-corrected B-mode images. The FMM was performed using an open-source implementation in MATLAB and C++ [45], where Version 1.0 of the code was used herein without modification.
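As a rough illustration of how first-arrival times on an SOS map yield element delays, the sketch below uses Dijkstra's algorithm on a 4-connected grid as a simple stand-in for the FMM (it is not the open-source implementation [45], and grid Dijkstra is a cruder eikonal approximation than true fast marching). In a homogeneous medium, the computed arrival time along a grid axis reduces to distance divided by SOS:

```python
import heapq
import numpy as np

def first_arrival(sos, dx, src):
    """Dijkstra on a 4-connected grid: a simple stand-in for the fast
    marching method. The edge cost between neighbors is dx over the local
    SOS, approximating the eikonal equation |grad T| = 1/c."""
    T = np.full(sos.shape, np.inf)
    T[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        t, (i, j) = heapq.heappop(heap)
        if t > T[i, j]:
            continue  # stale heap entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < sos.shape[0] and 0 <= nj < sos.shape[1]:
                nt = t + dx / sos[ni, nj]
                if nt < T[ni, nj]:
                    T[ni, nj] = nt
                    heapq.heappush(heap, (nt, (ni, nj)))
    return T
```

Running this from a target point on the reconstructed SOS map gives the arrival time at each element position; the per-element delays follow by subtracting each arrival time from the maximum over the aperture.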

III. RESULTS

A. Deep-Learning-Based Skull SOS and Thickness Estimation
When the RF signals were noise free, our network yielded an MAE of 0.15 mm and a Pearson correlation coefficient of 0.99 for thickness estimation, and an MAE of 13 m/s and a Pearson correlation coefficient of 0.95 for SOS estimation in the test set. Scatter plots between the deep-learning-estimated values and the ground truth are shown in Fig. 6(a) and (c).
The linear layers added to the network improved the model performance, as shown in Table I. For instance, the linear layers decreased the MAE from 0.24 to 0.15 mm for thickness estimation and from 22 to 13 m/s for SOS estimation.
The SOS maps of skull segments were reconstructed using the deep-learning-estimated thickness and SOS [Fig. 7(c) and (d)]. The profiles of the reconstructed maps were close to the ground-truth maps. Two more examples of reconstructed SOS maps are presented in the Supplementary Materials (Figs. S2 and S3). Although the reconstructed SOS maps only provided the lateral distribution of skull SOS without considering the axial distribution, the phase correction performance was comparable to that obtained with the ground-truth SOS maps, as presented in Sections III-B and III-C.
When noise was present in the RF signals, the MAE was 0.35 mm (5.6% of the mean ground truth), and the Pearson correlation coefficient was 0.95 in the test set for thickness estimation. For SOS estimation, the MAE was 39 m/s (1.7% of the mean ground truth), and the Pearson correlation coefficient was 0.47. Scatter plots between the deep-learning-estimated values and the ground truth are shown in Fig. 6(b) and (d).
Similar to the noise-free case, the linear layers also improved the deep learning performance when noise was present (Table I). For thickness estimation, the linear layers reduced the MAE from 0.40 to 0.35 mm and increased the Pearson correlation coefficient from 0.93 to 0.95. For SOS estimation, although the MAE remained unchanged (39 m/s), the Pearson correlation coefficient increased from 0.44 to 0.47.

B. Transcranial Ultrasound Focusing
Correcting the phase aberration using the reconstructed SOS maps and the FMM significantly improved the focusing of ultrasound beams through the skull. An example is shown in Fig. 8, which compares the beam profiles of focused beams generated using the central 32 active elements of the array under various conditions: focusing without the skull [Fig. 8(a)], focusing through a skull segment without aberration correction [Fig. 8(b)], focusing through the same skull segment with aberration correction using ground-truth skull thickness and SOS values [Fig. 8(c)], and focusing using deep-learning-estimated skull thickness and SOS values [Fig. 8(d)]. The focused beams were characterized using root-mean-square pressure values. The target focusing depth was 30 mm in this example. This focusing depth was accurately achieved when the skull was absent [Fig. 8(a)]. When the skull was present, as shown in Fig. 8(b), the focal depth was shifted up and the quality of the focus was severely degraded without phase aberration correction, highlighting the severe distortion of the beam caused by skull phase aberration. These aberration effects were largely eliminated when phase aberration correction was performed [Fig. 8(c) and (d)]. The performance was similar between correction using the ground-truth thickness and SOS and correction using the deep-learning-estimated thickness and SOS [Fig. 8(c) and (d)]. The full-widths at half-maximum (FWHMs) along the lateral direction at the target focal depth of 30 mm were 1.50, 2.08, and 2.09 mm for focused beams generated without the skull, with the skull and corrected using the ground-truth SOS map, and with the skull and corrected using the deep-learning-estimated thickness and SOS, respectively, demonstrating the effectiveness of phase aberration correction, although perfect correction was not achieved (e.g., a slightly wider beam after correction than the focused beam without the skull, as well as mild but noticeable side lobes in the beams after correction [Fig. 9(a)]).
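The FWHM values quoted above can be extracted from a lateral beam profile with a short helper; a minimal sketch using linear interpolation at the half-maximum crossings:

```python
import numpy as np

def fwhm(x, profile):
    """Full-width at half-maximum of a lateral beam profile, with linear
    interpolation at the two half-maximum crossings for sub-sample accuracy."""
    half = profile.max() / 2.0
    above = np.where(profile >= half)[0]
    lo, hi = above[0], above[-1]

    def cross(i0, i1):
        # Linearly interpolate the x position where the profile crosses 'half'.
        return x[i0] + (half - profile[i0]) * (x[i1] - x[i0]) / (profile[i1] - profile[i0])

    left = x[lo] if lo == 0 else cross(lo - 1, lo)
    right = x[hi] if hi == len(x) - 1 else cross(hi, hi + 1)
    return right - left
```

For a Gaussian profile with standard deviation sigma, this recovers the analytic FWHM of 2*sqrt(2*ln 2)*sigma.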
Similar to the noise-free cases, phase aberration correction improved the quality of transcranial focusing in the presence of noise [Fig. 8(e)]. The lateral profile of the focused beam through the skull, corrected using the deep-learning-estimated thickness and SOS, had an FWHM of 2.16 mm at the focal depth, close to the profile corrected using the ground-truth SOS map [Fig. 9].

C. Transcranial Ultrasound Imaging

The quality of transcranial imaging in the presence of noise was also improved by our approach. B-mode images through the skull segment without aberration correction are shown in Fig. 11.

IV. DISCUSSION
We developed a deep learning method to estimate the skull thickness and SOS using RF signals backscattered from the skull. The deep learning method yielded accurate skull thickness and SOS estimates. The deep-learning-estimated skull thickness and SOS values were shown to be effective for phase aberration correction, as demonstrated through two potential use-case scenarios: transcranial focusing and transcranial imaging. Phase aberration correction in transcranial focusing is important for a wide range of focused ultrasound applications, such as ultrasound neuromodulation [46], sonogenetics [47], sonothermogenetics [48], and transcranial high-intensity focused ultrasound (HIFU) ablation [49]. Phase aberration correction in transcranial imaging is critical for applications such as brain anatomical imaging, brain vascular imaging, functional ultrasound, Doppler imaging, and photoacoustic tomography.
Several studies have measured skull thickness and SOS using pulse-echo ultrasound. Wydra et al. [23] used a variable focus technique to measure the thickness of a flat thin phantom with an array probe at 2.25 MHz, achieving an error rate of 4.9%. Because their study used a single ground-truth value, a direct MAE comparison is not applicable. Minh et al. [50] used a similar refraction-corrected multifocus imaging method and performed measurements on plate-shaped bovine bone samples at a higher frequency (5 MHz). They reported a root-mean-square error (RMSE) of 0.94% for compressional sound velocity and 1.09% for thickness. For comparison, we recalculated our own RMSEs, which yielded 0.01% for SOS and 0.09% for thickness estimation. Wang and Jing [28] performed a numerical study using Wydra's approach to estimate the thickness and SOS of realistic skull models generated from CT scans of ex vivo human skulls. They assumed the skull to be an SOS-homogeneous medium and had an error of 2% in estimating the average SOS with no noise in the RF signals. They also measured the varying thickness of a skull segment, resulting in a maximum error of 10% for thickness estimation at an average thickness of 4.5 mm. Using the same ultrasound frequency of 1.5 MHz, our approach showed lower MAEs. Nonetheless, it is noteworthy that outliers exist in our thickness estimation, with a maximum relative error of 26.1% when the ground-truth thickness is 12.28 mm. This error can be attributed to the ground-truth thickness being much larger than our focal depth of 5 mm.
We also demonstrated the capability of our method to correct phase aberration caused by the skull in ultrasound focusing and imaging. However, the corrected beams had a wider beamwidth than the beam without the skull. This phenomenon might be caused by the high attenuation in the skull and its power-law frequency dependence. With a power-law exponent of 2, the high-frequency components were more strongly attenuated, which led to a wider beamwidth, degraded lateral resolution, and a reduced SNR. To address these issues, signal intensity enhancement methods such as coded excitation [51] can be incorporated into our proposed phase aberration correction method to further recover the focusing quality. Furthermore, the amplitude-inversing modulation method can be used by treating the skull as an attenuating layer close to the transducer array [11]. Because the skull attenuation is frequency dependent, frequency-dependent amplitude correction could perhaps be used to recover a sharper focus.
Our DL-based approach exhibits sensitivity to noise. The MAE for thickness estimation increased from 0.15 mm (noise free) to 0.35 mm at 10-dB SNR, while the MAE for SOS estimation increased from 13 m/s (noise free) to 39 m/s at 10-dB SNR. However, the performance of phase aberration correction in transcranial focusing and imaging appeared robust to noise.
The proposed method could be practical for clinical applications due to several advantages. First, it is relatively easy to implement because it utilizes the same ultrasound array for both transcranial focusing/imaging and acquiring data for skull thickness and SOS estimation. This eliminates the need for additional CT scans of the skull. Second, our method is not location dependent and can be applied to various skull locations, not only the thin temporal window. Simulations performed on 3595 locations of five ex vivo human skulls, with ground-truth thickness ranging from 2.8 to 11.4 mm, demonstrated that our DL-based method provided accurate estimates for a wide range of skull profiles. Third, our neural network does not require large-scale data for training, as demonstrated by the sufficiency of the RF data from the five ex vivo skulls used in this study.
This study has several limitations.
1) The numerical skull models were derived from diagnostic CT images at a resolution (0.39 mm/pixel) that required image interpolation prior to acoustic simulations.Future studies could adopt higher resolution CT images (e.g., micro-CT at 0.05 mm/pixel) to reveal the finer structure of the skull in order to more accurately model the scattering occurring in the skull.Also, this study did not account for the elevation interpolation of the CT images, which could be investigated in subsequent work.
2) The method generates quasi-homogeneous SOS maps rather than providing voxel-level granularity.Taking into account the axial inhomogeneity of the SOS maps may further improve the effectiveness of phase aberration correction.
3) Low transcranial transmission is a significant challenge in transcranial ultrasound, which could impact the ultimate effectiveness of any phase aberration correction methods.This study did not address the low transmission challenge although potential solutions may be available.
4) This study considered high-contrast point targets.Imaging soft tissues through the skull could be more challenging.
5) The 2-D simulations were performed in this study for computational cost considerations.The 2-D simulations are intrinsically less realistic than 3-D simulations due to the lack of consideration of out-of-plane refraction.
6) As this is a numerical-only proof-of-concept study, experimental validation is needed to further evaluate the potential of the method.

V. CONCLUSION
This study demonstrated the feasibility of the proposed deep-learning-based pulse-echo ultrasound method for transcranial phase aberration correction. Aberration correction using deep-learning-estimated skull thickness and SOS values was shown to significantly improve the quality of transcranial focusing and imaging.

Fig. 1. (a) Representative slice of the CT scan of an ex vivo human skull. (b) Processed CT image with the skull segmented from the background. (c) SOS map derived from (b), with a 40-mm-long skull segment indicated by the red rectangular box. (d) The 40-mm-long segment after rotation.

Fig. 2. Noise levels used in the simulations. Representative RF signals obtained by simulating transcranial imaging of point targets, without noise and with noise, shown in the temporal windows of (a) [0, 80 µs] and (b) [30, 50 µs]. Echoes from the point targets appear around 35, 41, and 48 µs.

Fig. 3. (a) B-mode image of a skull segment generated by the synthetic aperture approach. (b) Corresponding ground-truth SOS map of the skull segment, superimposed with a red line representing the outer surface of the skull as detected by the synthetic aperture approach.

Fig. 4. Diagram of the ultrasound data acquisition sequence used to extract the skull thickness and SOS. The white bar represents the array, and active elements are marked in green. (a) The ground-truth local thickness and SOS of the scanline (black narrow rectangle) were calculated by spatially averaging the thickness and SOS of the skull below the active elements. (b) Shifting the active elements by an index of 1 and repeating the procedure in (a) yielded the local thickness and SOS of the next scanline. (c) Combining the results from all acquisitions yielded the skull profile and the lateral SOS distribution.
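The sliding-aperture labeling scheme of Fig. 4 can be sketched as follows. This is a minimal illustration, not the authors' code; the array size, element pitch, and per-element skull values are hypothetical.

```python
import numpy as np

def local_ground_truth(thickness, sos, n_active=15):
    """Slide the window of active elements across the array and
    average the per-element skull thickness and SOS beneath it,
    yielding one ground-truth (thickness, SOS) pair per scanline."""
    n_elem = len(thickness)
    n_lines = n_elem - n_active + 1   # one scanline per window position
    t_local = np.empty(n_lines)
    c_local = np.empty(n_lines)
    for i in range(n_lines):          # shift active elements by 1 each step
        t_local[i] = thickness[i:i + n_active].mean()
        c_local[i] = sos[i:i + n_active].mean()
    return t_local, c_local

# Illustrative 64-element array over a skull segment
thk = np.linspace(5.0, 7.0, 64)       # mm, per-element thickness (assumed)
sos = np.full(64, 2341.0)             # m/s, uniform SOS (assumed)
t, c = local_ground_truth(thk, sos)   # 50 scanlines for 64 elements
```

Combining the per-scanline outputs, as in Fig. 4(c), gives the lateral distributions of thickness and SOS along the array.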
D. Deep-Learning-Based Skull SOS and Thickness Estimation

A deep learning model was developed to estimate the local skull thickness and SOS from the frequency-domain representation of the 15 channels of RF signals. The neural network architecture (Fig. 5) consisted of multiple layers of operations, starting with two linear layers, followed by two convolutional blocks, and ending with two fully connected layers. The first convolutional block consisted of a one-dimensional (1-D) convolutional layer (convolution along the frequency axis), an activation layer, and a pooling layer. The second convolutional block consisted of a 1-D convolutional layer, a batch normalization (BN) layer, an activation layer, a pooling layer, and a dropout layer. A fast Fourier transform was applied to the 15 channels of RF data to yield the frequency-domain representation of the signals. The deep neural network used signals in the frequency range of 0-4 MHz because the signal energy outside this range was negligible. The real and imaginary parts of the frequency-domain RF data were both used, resulting in 30 channels of data (15 channels for the real part and 15 for the imaginary part).
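The frequency-domain preprocessing described above can be sketched as follows; the sampling rate and record length here are illustrative assumptions, not values from the study.

```python
import numpy as np

def preprocess_rf(rf, fs=20e6, f_max=4e6):
    """Convert 15 channels of time-domain RF data into the 30-channel
    network input: real and imaginary parts of the spectrum, 0-4 MHz."""
    n_ch, n_samp = rf.shape
    spectrum = np.fft.rfft(rf, axis=1)            # one-sided FFT along time
    freqs = np.fft.rfftfreq(n_samp, d=1.0 / fs)   # bin frequencies in Hz
    band = freqs <= f_max                         # keep 0-4 MHz only
    spectrum = spectrum[:, band]
    # Stack real and imaginary parts -> 2 * n_ch input channels
    return np.concatenate([spectrum.real, spectrum.imag], axis=0)

rf = np.random.randn(15, 2000)   # 15 channels, 2000 samples (assumed)
x = preprocess_rf(rf)            # shape (30, n_bins), n_bins set by f_max
```

The resulting 30-channel array is the input to the linear layers at the front of the network, with the 1-D convolutions then operating along the frequency axis.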

Fig. 5. Diagram of the neural network used to predict the skull thickness and SOS from backscattered RF signals.

Fig. 6. Scatter plots of deep-learning-estimated thickness values versus ground truth, obtained from (a) noise-free RF signals and (b) noisy RF signals in the test set. Scatter plots of deep-learning-estimated SOS values versus ground truth, obtained from (c) noise-free RF signals and (d) noisy RF signals in the test set.

Fig. 7. (a) Ground-truth SOS map and (b) depth-averaged ground-truth SOS map of the skull segment. Reconstructed SOS maps of the skull segment based on the detected outer surface of the skull and the thickness and SOS values estimated from (c) noise-free RF signals and (d) noisy RF signals.

Fig. 8. Profiles of beams focused at 30 mm (a) without the skull, (b) with the skull but no correction, (c) with the skull and corrected by the ground-truth SOS map, (d) with the skull and corrected by the deep-learning-estimated SOS map when no noise was present in the RF signals, and (e) with the skull and corrected by the deep-learning-estimated SOS map when noise was present in the RF signals. The same dynamic range was used for all panels.

Fig. 9. Lateral profiles of beams at the focal depth of 30 mm without the skull (blue dashed curves), with the skull but no correction (red dotted curves), with the skull and corrected by the ground-truth SOS map (orange solid curves), and with the skull and corrected by the deep-learning-estimated SOS map (purple solid curves), obtained when (a) no noise was present and (b) noise was present in the RF signals.

Fig. 10. (a) Diagram of the imaging simulations. B-mode images of 12 point targets (b) without the skull, (c) with the skull but no correction, (d) with the skull and corrected using the SOS map of a rectangular homogeneous skull with the mean ground-truth thickness and SOS, (e) with the skull and corrected using the ground-truth SOS map, and (f) with the skull and corrected using the deep-learning-estimated SOS map. (g) Axial profiles of the four targets in the central column. (h) Lateral profiles of the three targets in the third row from the top. All images were obtained from noise-free RF signals.
C. Transcranial Ultrasound Imaging

Phase aberration correction using the reconstructed SOS maps and the FMM significantly improved transcranial ultrasound imaging. A representative example with no noise in the RF signals is shown in Fig. 10, comparing the B-mode images of 12 point targets under various conditions: imaging without the skull [Fig. 10(b)], imaging through a skull segment without aberration correction [Fig. 10(c)], imaging through the same skull segment with aberration correction using a homogeneous rectangular plate with the mean ground-truth skull thickness (6.3 mm) and SOS (2341 m/s) [Fig. 10(d), hereinafter referred to as homogeneous plate correction], with aberration correction using the ground-truth skull SOS map [Fig. 10(e)], and with aberration correction using the SOS map reconstructed from deep-learning-estimated thickness and SOS values [Fig. 10(f)]. Targets at various depths and horizontal positions were accurately imaged without the skull [Fig. 10(b)]. When the skull was present, the imaged targets deviated from their original positions and were significantly distorted [Fig. 10(c)]. The homogeneous plate correction corrected the depth shift of the targets, but not the lateral shift or the degraded lateral resolution, indicating limited correction performance without skull-specific profile information.

Fig. 11. B-mode images of 12 point targets reconstructed from RF signals with noise, (a) with aberration but no correction, (b) with the skull and corrected using the ground-truth SOS map, and (c) with the skull and corrected using the deep-learning-estimated SOS map. (d) Axial profiles of the four targets in the central column. (e) Lateral profiles of the three targets in the third row from the top.
B-mode images with aberration but no correction [Fig. 11(a)], corrected using the ground-truth SOS map [Fig. 11(b)], and corrected using the deep-learning-estimated SOS map [Fig. 11(c)] were reconstructed from RF signals with noise. The 12 targets were clearly visible in the B-mode image with the deep-learning correction. Two more examples of phase-aberration-corrected images are shown in the Supplementary Materials (Figs. S4 and S5).

TABLE I
PEARSON CORRELATION COEFFICIENTS AND MEAN ABSOLUTE ERRORS FOR THICKNESS AND SOS ESTIMATION

The MAE was 0.15 mm, and the Pearson correlation coefficient was 0.99 for thickness estimation in the test set. The MAE was 13 m/s, and the Pearson correlation coefficient was 0.95 for SOS estimation in the test set. For comparison, the mean ground-truth skull thickness was 6.3 mm, and the mean ground-truth skull SOS was 2341 m/s. Scatter plots between the deep-learning-estimated values and the ground truth of the test set are shown in Fig. 6.
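For reference, the MAE and Pearson correlation coefficient reported in Table I follow the standard definitions, which can be computed as below. The synthetic thickness values here are purely illustrative, not the study's data.

```python
import numpy as np

def mae(pred, truth):
    """Mean absolute error between estimates and ground truth."""
    return np.mean(np.abs(pred - truth))

def pearson_r(pred, truth):
    """Pearson correlation coefficient between estimates and ground truth."""
    p = pred - pred.mean()
    t = truth - truth.mean()
    return np.sum(p * t) / np.sqrt(np.sum(p ** 2) * np.sum(t ** 2))

# Illustrative check with synthetic thickness estimates (mm)
truth = np.array([5.0, 6.0, 7.0, 8.0])
pred = truth + np.array([0.1, -0.1, 0.1, -0.1])
err = mae(pred, truth)      # 0.1 mm
r = pearson_r(pred, truth)  # close to 1 for small symmetric errors
```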