In this work, we present DeepEraser, an effective deep network for generic text removal. DeepEraser uses a recurrent architecture that erases the text in an image through iterative operations. The idea comes from the process of erasing pencil script, in which the area designated for removal is monitored continuously and the text is attenuated progressively until the erasure is thorough and clean. Technically, at each iteration an erasing module is deployed that not only explicitly aggregates the previous erasing progress but also mines additional semantic context to erase the target text. Through these iterative refinements, the text regions are progressively replaced with more appropriate content and finally converge to a relatively accurate state. Furthermore, a custom mask generation strategy is introduced so that DeepEraser can remove text adaptively, rather than indiscriminately removing all the text in an image. DeepEraser is notably compact, with only 1.4M parameters, and is trained end to end. To verify its effectiveness, extensive experiments are conducted on several prevalent benchmarks, including SCUT-Syn, SCUT-EnsText, and the Oxford Synthetic text dataset. The quantitative and qualitative results demonstrate that DeepEraser outperforms state-of-the-art methods and generalizes well to custom mask text removal. The code and pre-trained models are available at https://github.com/fh2019ustc/DeepEraser
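The iterative, mask-guided erasing described in the abstract can be illustrated with a small toy sketch. This is not DeepEraser's learned erasing module: a hand-written neighbour-averaging update stands in for the network, and every name here (`erase_step`, `iterative_erase`, `step_size`, `num_iters`) is a hypothetical choice made for illustration. The sketch only shows the control flow, in which each iteration takes the previous result and the mask as input and updates the masked pixels alone.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of floats
Mask = List[List[int]]     # 1 = pixel designated for removal, 0 = keep


def erase_step(image: Image, mask: Mask, step_size: float = 0.5) -> Image:
    """One toy refinement step (an assumption, not the paper's module).

    Each masked pixel moves part of the way toward the mean of its
    4-connected neighbours; unmasked pixels are copied unchanged.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0:
                continue
            neighbours = [
                image[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w
            ]
            if not neighbours:  # 1x1 image: nothing to borrow from
                continue
            target = sum(neighbours) / len(neighbours)
            out[y][x] = image[y][x] + step_size * (target - image[y][x])
    return out


def iterative_erase(image: Image, mask: Mask, num_iters: int = 8) -> Image:
    """Apply the toy step repeatedly, feeding each result into the next."""
    current = [row[:] for row in image]
    for _ in range(num_iters):
        current = erase_step(current, mask)
    return current


if __name__ == "__main__":
    # A bright "text" pixel on a uniform background, masked for removal.
    img = [[0.2, 0.2, 0.2], [0.2, 1.0, 0.2], [0.2, 0.2, 0.2]]
    msk = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    print(iterative_erase(img, msk)[1][1])
```

In the actual method the update is produced by a trained network that also draws on semantic context; the fixed averaging rule and the iteration count above are placeholders only.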