Despite recent advances in semantic segmentation, where hard pixels lie and what makes them hard to segment remain largely unexplored. Existing research only separates an image into easy and hard regions and empirically observes that the latter are associated with object boundaries. In this paper, we conduct a comprehensive analysis of hard pixel errors, categorizing them into three types: false responses, merging mistakes, and displacements. Our findings reveal a quantitative association between hard pixels and aliasing, the distortion caused by the overlapping of frequency components in the Fourier domain during downsampling. To identify the frequencies responsible for aliasing, we propose using the equivalent sampling rate to calculate the Nyquist frequency, which marks the threshold beyond which aliasing occurs. We then introduce the aliasing score as a metric to quantify the extent of aliasing. While all three types of hard pixels are positively correlated with the proposed aliasing score, they exhibit different patterns. Based on these findings, we propose two novel modules, a de-aliasing filter (DAF) and a frequency-mixing (FreqMix) module, that alleviate aliasing degradation by accurately removing or adjusting frequencies above the Nyquist frequency. The DAF precisely removes the frequencies responsible for aliasing before downsampling, while FreqMix dynamically selects high-frequency components within the encoder block. Experimental results demonstrate consistent improvements on semantic segmentation and low-light instance segmentation tasks. The code is available at: https://github.com/Linwei-Chen/Seg-Aliasing.
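To make the core idea concrete, the sketch below illustrates the two Fourier-domain operations the abstract describes: an aliasing score measuring how much spectral energy lies above the Nyquist frequency implied by a downsampling stride, and a de-aliasing step that removes those frequencies before strided downsampling. This is a minimal NumPy illustration, not the paper's implementation; the function names, the box-shaped frequency mask, and the energy-ratio definition of the score are assumptions for demonstration.

```python
import numpy as np

def aliasing_score(x, stride=2):
    """Illustrative proxy for an aliasing score: the fraction of spectral
    energy above the Nyquist frequency of the downsampled grid.
    (The paper's exact definition may differ -- this is a sketch.)"""
    h, w = x.shape
    power = np.abs(np.fft.fft2(x)) ** 2
    fy = np.fft.fftfreq(h)[:, None]   # cycles/pixel along rows
    fx = np.fft.fftfreq(w)[None, :]   # cycles/pixel along columns
    nyq = 1.0 / (2 * stride)          # Nyquist frequency after downsampling
    high = (np.abs(fy) > nyq) | (np.abs(fx) > nyq)
    return power[high].sum() / power.sum()

def dealias_downsample(x, stride=2):
    """Low-pass filter in the Fourier domain, then downsample by striding.
    Frequencies above 1/(2*stride) cycles/pixel cannot be represented on the
    downsampled grid and would fold back (alias) into lower frequencies,
    so they are zeroed out before the stride."""
    h, w = x.shape
    X = np.fft.fft2(x)
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    nyq = 1.0 / (2 * stride)
    mask = (np.abs(fy) <= nyq) & (np.abs(fx) <= nyq)
    x_lp = np.fft.ifft2(X * mask).real
    return x_lp[::stride, ::stride]
```

For example, a one-pixel checkerboard (the highest representable frequency) has an aliasing score of 1.0 for stride 2: naive striding would collapse it to a constant (the high frequency folds onto DC), whereas filtering first removes the unrepresentable component entirely.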