Detecting scene text of arbitrary shape is important for scene understanding tasks. Because text in natural scenes is complex and diverse, existing scene text detectors have limited accuracy on arbitrarily shaped text. In this paper, we propose a novel arbitrary-shape scene text detector based on boundary point dynamic optimization (BPDO). The model consists of a text aware module (TAM) and a boundary point dynamic optimization module (DOM). The segmentation-based TAM extracts prior information about the text region to obtain boundary points that describe the central region of the text. The DOM, built on the idea of deformable attention, then gradually refines the position of each boundary point using information from its adjacent region. Experiments on the CTW-1500, Total-Text, and MSRA-TD500 datasets show that the proposed model performs better than or comparably to state-of-the-art algorithms, demonstrating its effectiveness.
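The refinement idea described above (start from boundary points around the central text region, then move each point step by step using information sampled from its neighborhood) can be illustrated with a small toy sketch. This is not the BPDO implementation: the ring-shaped score map, the fixed 3x3 sampling pattern `OFFSETS`, the softmax `temperature`, and all function names are assumptions made for illustration, and a fixed softmax weighting stands in for the learned offsets and attention weights of real deformable attention.

```python
import math

def make_ring_score_map(size=64, radius=20.0, sigma=3.0):
    """Toy 'boundary score' map (assumption): values peak on a circle of `radius`."""
    c = size / 2.0
    return [[math.exp(-((math.hypot(x - c, y - c) - radius) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def bilinear_sample(grid, x, y):
    """Bilinearly sample a 2D list at continuous (x, y), clamped to the border."""
    h, w = len(grid), len(grid[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

# Fixed sampling pattern around each point (hypothetical; a learned model
# would predict these offsets per point instead of using a fixed grid).
OFFSETS = [(dx, dy) for dy in (-2.0, 0.0, 2.0) for dx in (-2.0, 0.0, 2.0)]

def refine_boundary_points(points, score_map, iterations=10, temperature=0.1):
    """Iteratively move each point by a softmax-weighted average of its offsets."""
    for _ in range(iterations):
        updated = []
        for px, py in points:
            scores = [bilinear_sample(score_map, px + dx, py + dy) for dx, dy in OFFSETS]
            top_score = max(scores)  # subtract the max for numerical stability
            weights = [math.exp((s - top_score) / temperature) for s in scores]
            total = sum(weights)
            move_x = sum(w * dx for w, (dx, _) in zip(weights, OFFSETS)) / total
            move_y = sum(w * dy for w, (_, dy) in zip(weights, OFFSETS)) / total
            updated.append((px + move_x, py + move_y))
        points = updated
    return points

def mean_radius_error(points, center, radius):
    """Average distance between the points and the target circle."""
    return sum(abs(math.hypot(x - center, y - center) - radius) for x, y in points) / len(points)

if __name__ == "__main__":
    score_map = make_ring_score_map()
    # Initial points on a shrunken contour, standing in for the central text region.
    initial = [(32 + 14 * math.cos(2 * math.pi * k / 16),
                32 + 14 * math.sin(2 * math.pi * k / 16)) for k in range(16)]
    refined = refine_boundary_points(initial, score_map)
    print("error before:", mean_radius_error(initial, 32.0, 20.0))
    print("error after: ", mean_radius_error(refined, 32.0, 20.0))
```

In this toy setting the points start on a shrunken contour and drift outward toward the high-score ring over a few iterations. In a trained detector, both the sampling locations and their weights would come from learned features rather than a hand-built score map.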