ASF-LKUNet: Adjacent-Scale Fusion U-Net with Large-kernel for Medical Image Segmentation
In this paper, we propose an adjacent-scale fusion 2.5D U-Net with large kernels (ASF-LKUNet) for multi-class medical image segmentation. To reduce model complexity, we use a U-shaped encoder-decoder as the base architecture of ASF-LKUNet. In the encoder path, we design a large-kernel residual block that combines large and small kernels, so that it captures global and local features simultaneously while retaining the advantages of ViT. Furthermore, we develop an adjacent-scale GRN channel attention mechanism that combines low-level details with high-level semantics by fusing the features of adjacent scales. The adaptive fusion is implemented by an improved large-kernel channel attention based on global response normalization (GRN). In ASF-LKUNet, all large kernels are implemented as depth-wise convolutions to further reduce complexity. We compare the proposed method with ten other methods, including those based on UNets, multi-scale fusion, 3D CNNs, and ViTs. Extensive performance experiments and interpretability analysis show that ASF-LKUNet outperforms the competing methods with lower model complexity on different medical applications, including multi-organ segmentation in CT images and cardiac multi-structure segmentation in MRI images.
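To make the GRN building block mentioned above concrete, the following is a minimal sketch of plain global response normalization as it is commonly defined for ConvNeXt V2-style networks. It is not the adjacent-scale GRN channel attention of ASF-LKUNet itself, whose exact form is not given here. The function name `grn`, the flat-list tensor layout, and the `eps` default are assumptions made for illustration only.

```python
import math

def grn(x, gamma, beta, eps=1e-6):
    """Global response normalization over a feature map (illustrative sketch).

    x     : list of C channels, each a flat list of spatial values (assumed layout)
    gamma : list of C learnable scales (hypothetical parameter names)
    beta  : list of C learnable offsets
    """
    # 1) Global aggregation: L2 norm of each channel over all spatial positions.
    g = [math.sqrt(sum(v * v for v in ch)) for ch in x]
    # 2) Divisive normalization: each channel's response relative to the channel mean.
    mean_g = sum(g) / len(g)
    n = [gi / (mean_g + eps) for gi in g]
    # 3) Calibration with a residual connection: gamma * (x * n) + beta + x.
    return [
        [gamma[c] * v * n[c] + beta[c] + v for v in ch]
        for c, ch in enumerate(x)
    ]

# Usage: two channels with two spatial positions each.
features = [[3.0, 4.0], [0.0, 0.0]]
out = grn(features, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` and `beta` set to zero the function reduces to the identity, which is why such a layer can be added to a network without disturbing it at initialization. A channel attention built on this idea would use the normalized per-channel responses to reweight fused features.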
Email Address of Submitting Author: rfwang@xidian.edu.cn
Submitting Author's Institution: Xidian University