Document image binarization separates foreground text from degraded backgrounds while preserving thin, broken, and low-contrast strokes. Deep learning has improved binarization performance, but most existing approaches rely on convolutional, transformer-based, or generative architectures, and Mamba-based state space models remain largely unexplored for this task. In this work, we investigate Mamba-based feature propagation and observe that direct state-space propagation may dilute weak foreground cues during long-range modeling, especially faint ink traces, fragmented characters, and boundary-sensitive stroke details. To address this problem, we propose DeepMine-Mamba, a Mamba-based binarization framework equipped with a novel Anti-Dilution Gate that estimates propagation-induced feature changes and selectively restores stroke-sensitive local responses while suppressing unnecessary background enhancement. Experiments on the DIBCO/H-DIBCO benchmarks under a strict leave-one-year-out protocol show that DeepMine-Mamba achieves competitive overall performance, with strong average F-measure (FM) and pseudo-F-measure (Fps) across benchmark years. Ablation results further show that the Anti-Dilution Gate is the key component for mitigating propagation-induced foreground dilution and improving stroke preservation.
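The leave-one-year-out protocol and the FM score referred to in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the year list in the usage note are hypothetical, and FM is assumed to be the standard F-measure (harmonic mean of precision and recall over foreground pixels, reported in percent). The pseudo-F-measure (Fps) requires stroke-weighted maps and is not reproduced here.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_year_out(years: Sequence[int]) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_years, held_out_year) pairs, one per benchmark year.

    Each year is held out exactly once for testing; all remaining
    years form the training pool for that split.
    """
    for held_out in years:
        train = [y for y in years if y != held_out]
        yield train, held_out


def f_measure(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """F-measure in percent for binary maps where 1 marks a text pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # text pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1  # background predicted as text
            elif p == 0 and t == 1:
                fn += 1  # text pixel missed
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


def average_over_years(per_year_scores: Dict[int, float]) -> float:
    """Average a per-year metric across all held-out years."""
    return sum(per_year_scores.values()) / len(per_year_scores)
```

As a usage example, `list(leave_one_year_out([2016, 2017, 2018]))` produces three splits, the first training on 2017 and 2018 and testing on 2016 (the years are placeholders, not the set used in the paper). Averaging the per-year scores, rather than pooling all pixels, keeps any single large benchmark year from dominating the reported figure.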