MPCM-Net: Multi-scale network integrates partial attention convolution with Mamba for ground-based cloud image segmentation

Ground-based cloud image segmentation is a critical research domain for photovoltaic power forecasting. Current deep learning approaches focus primarily on refining encoder-decoder architectures. However, existing methods have several limitations: (1) they rely on dilated convolutions for multi-scale context extraction, which limits the effectiveness of partial features and the interoperability between channels; (2) attention-based feature enhancement implementations neglect the balance between accuracy and throughput; and (3) decoder modifications fail to establish global interdependencies among hierarchical local features, which limits inference efficiency. To address these challenges, we propose MPCM-Net, a Multi-scale network that integrates Partial attention Convolutions with Mamba architectures to improve both segmentation accuracy and computational efficiency. Specifically, the encoder incorporates MPAC, which comprises: (1) an MPC block with ParCM and ParSM that enables global spatial interaction across multi-scale cloud formations, and (2) an MPA block that combines ParAM and ParSM to extract discriminative features at reduced computational complexity. On the decoder side, an M2B is employed to mitigate contextual loss through an SSHD that maintains linear complexity while enabling deep feature aggregation across spatial and scale dimensions. As a key contribution to the community, we also introduce and release the CSRC dataset, a clearly labeled, fine-grained segmentation benchmark designed to overcome the critical limitations of existing public datasets. Extensive experiments on CSRC demonstrate that MPCM-Net outperforms state-of-the-art methods, achieving an optimal balance between segmentation accuracy and inference speed. The dataset and source code will be available at https://github.com/she1110/CSRC.
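
To illustrate why a state-space decoder component can maintain linear complexity, the following is a minimal sketch of a scalar linear state-space recurrence applied to a flattened feature map. It is not the SSHD or M2B of MPCM-Net: the function name `ssm_scan`, the fixed scalar parameters `a`, `b`, `c`, and the row-major flattening are all assumptions made for illustration, and Mamba itself uses input-dependent (selective) parameters rather than fixed ones. The point is only that each position is visited once, so the cost grows linearly with the number of positions.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state-space recurrence over a 1-D sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    The parameters a, b, c are fixed here (hypothetical, for illustration);
    a selective model such as Mamba would compute them from the input.
    """
    h = 0.0  # hidden state carried along the sequence
    y = []
    for x_t in x:  # a single pass: O(len(x)) time, O(1) state
        h = a * h + b * x_t
        y.append(c * h)
    return y


# Flatten a tiny 2x3 "feature map" row by row (assumed scan order).
feature_map = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
sequence = [v for row in feature_map for v in row]
outputs = ssm_scan(sequence, a=0.5, b=1.0, c=1.0)
```

Because the hidden state carries information forward, the last output still depends on the first input, which is the sense in which such a scan can relate distant positions without the quadratic cost of pairwise attention.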
