This paper presents ERN-Net, an Evolving Reason Node-Net for efficient document image binarization. ERN-Net uses evolving reason nodes and multi-scale reasoning to enhance degradation-sensitive regions such as faint strokes, broken characters, and noisy backgrounds. We further compare ResNet-101, ConvNeXt-Tiny, and ConvNeXt-Base as backbones, and find that ConvNeXt-Tiny provides the best practical trade-off between accuracy and memory usage. In addition, DIBCO-based pretraining improves binarization performance without increasing model memory consumption, at a cost of only about 1.5 additional training hours. Experiments on DIBCO-style benchmarks show that ERN-Net is effective under low-data and low-memory settings.