In pediatric cardiology, accurate and immediate assessment of cardiac function through echocardiography is crucial, since in many emergencies it determines whether urgent intervention is required. However, echocardiographic images are ambiguous and suffer from heavy background noise, which makes accurate segmentation difficult. Existing methods are inefficient and, because of this noise, are prone to mistakenly segmenting background noise areas as the left ventricular area. To address these issues, we introduce P-Mamba, which integrates the Mixture of Experts (MoE) concept for efficient pediatric echocardiographic left ventricular segmentation. Specifically, we use the recently proposed ViM layers from Vision Mamba to improve our model's computational and memory efficiency while modeling global dependencies. We also devise a discrete wavelet transform (DWT)-based Perona-Malik Diffusion (PMD) Block that suppresses noise while preserving the left ventricle's local shape cues. P-Mamba thus combines the PMD Block's noise suppression and local feature extraction with Mamba's efficient design for global dependency modeling. We conducted segmentation experiments on two pediatric ultrasound datasets and one general ultrasound dataset, EchoNet-Dynamic, and achieved state-of-the-art (SOTA) results. Leveraging the strengths of the P-Mamba block, our model demonstrates superior accuracy and efficiency compared to established models, including vision transformers with quadratic and linear computational complexity.
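To make the idea of edge-preserving noise suppression concrete, the following is a minimal sketch of classical Perona-Malik anisotropic diffusion on a 2-D image. It is not the DWT-based PMD Block of P-Mamba, whose details are not given here. The function name `perona_malik`, the parameters `n_iter`, `kappa`, and `step`, the exponential conductance function, and the synthetic test image are all assumptions chosen for illustration.

```python
import numpy as np


def perona_malik(image, n_iter=20, kappa=0.1, step=0.2):
    """Classical Perona-Malik diffusion (illustrative, not the paper's block).

    kappa sets the gradient magnitude above which diffusion is suppressed;
    step should stay <= 0.25 for the explicit 4-neighbour scheme to be stable.
    """
    u = np.asarray(image, dtype=np.float64).copy()
    for _ in range(n_iter):
        # Replicating the border gives zero differences (no flux) at the edges.
        padded = np.pad(u, 1, mode="edge")
        diffs = (
            padded[:-2, 1:-1] - u,  # north neighbour
            padded[2:, 1:-1] - u,   # south neighbour
            padded[1:-1, :-2] - u,  # west neighbour
            padded[1:-1, 2:] - u,   # east neighbour
        )
        # Conductance g(d) = exp(-(d / kappa)^2): close to 1 for small
        # (noise-like) differences, close to 0 across strong edges.
        flux = sum(np.exp(-((d / kappa) ** 2)) * d for d in diffs)
        u += step * flux
    return u


# Hypothetical test image: a vertical step edge corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[:, 32:] = 1.0
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
smoothed = perona_malik(noisy, n_iter=30, kappa=0.3, step=0.2)

# Noise level in the flat left region before and after diffusion,
# and the contrast that remains across the step edge.
noise_before = noisy[:, :30].std()
noise_after = smoothed[:, :30].std()
edge_contrast = smoothed[:, 34:].mean() - smoothed[:, :30].mean()
```

In this sketch `kappa` is set between the noise amplitude and the edge height, so small fluctuations are averaged out while the step edge barely diffuses. That is the same qualitative trade-off the abstract attributes to the PMD Block: suppressing noise while keeping local shape cues.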