Instance segmentation in low-light imagery remains largely unexplored because of the challenges such conditions impose, including shot noise from low photon counts, color distortion, and reduced contrast. In this paper, we propose an end-to-end solution to this task. Building on Mask R-CNN, our method inserts weighted non-local (NL) blocks into the feature extractor, which enables inherent denoising at the feature level. As a result, the method does not require aligned ground-truth images during training and can therefore be trained on real-world low-light datasets. We introduce additional learnable weights at each layer to improve the network's adaptability to real-world noise characteristics, which affect different feature scales in different ways. Experimental results show that the proposed method outperforms the pretrained Mask R-CNN by +10.0 Average Precision (AP), with the introduction of weighted NL blocks further improving AP by +1.0.
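To make the idea of a weighted NL block concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the common embedded-Gaussian non-local formulation (softmax attention over all spatial positions) and assumes that the "learnable weight" is a single scalar per block that scales the non-local response before it is added back to the input features. The names `init_params`, `weighted_nl_block`, `theta`, `phi`, `g`, and `out` are hypothetical, and the parameters are random placeholders rather than trained values.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def init_params(channels, inner, rng):
    """Random stand-ins for the 1x1 projections of one NL block (hypothetical names)."""
    scale = 1.0 / np.sqrt(channels)
    return {
        "theta": rng.normal(0.0, scale, (channels, inner)),
        "phi": rng.normal(0.0, scale, (channels, inner)),
        "g": rng.normal(0.0, scale, (channels, inner)),
        "out": rng.normal(0.0, scale, (inner, channels)),
    }


def weighted_nl_block(x, params, weight):
    """Return x + weight * NL(x) for a feature map x of shape (C, H, W).

    Assumption: `weight` is a scalar that would be learned per layer.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w).T               # (N, C): one row per spatial position
    theta = flat @ params["theta"]             # (N, inner) query embedding
    phi = flat @ params["phi"]                 # (N, inner) key embedding
    g = flat @ params["g"]                     # (N, inner) value embedding
    attn = softmax(theta @ phi.T, axis=1)      # (N, N): each row sums to 1
    response = (attn @ g) @ params["out"]      # (N, C): aggregated non-local response
    out = flat + weight * response             # weighted residual connection
    return out.T.reshape(c, h, w)


rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))          # toy feature map: 8 channels, 4x4 grid
params = init_params(channels=8, inner=4, rng=rng)

out = weighted_nl_block(features, params, weight=0.5)
identity = weighted_nl_block(features, params, weight=0.0)
```

Under this assumption, a weight of zero reduces the block to the identity, so a per-layer weight lets the network decide how much non-local averaging to apply at each feature scale. Whether the paper uses a scalar, per-channel, or other weighting scheme is not stated in the abstract.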