Printed Mathematical Expression Recognition (PMER) has progressed rapidly in recent years. However, because Convolutional Neural Networks capture insufficient context information, some mathematical symbols may be misrecognized or missed. To address this problem, we propose a Dual Branch transformer-based Network (DBN) that learns both local and global context information for accurate PMER. In DBN, local and global features are extracted simultaneously, and a Context Coupling Module (CCM) is developed so that the global and local contexts complement each other's features. CCM couples the two contexts interactively, so that the coupled context clues are highly correlated with each expression symbol. Additionally, we design a Dynamic Soft Target (DST) strategy that exploits the similarities among symbol categories to generate reasonable labels. Experimental results demonstrate that DBN recognizes mathematical expressions accurately and achieves state-of-the-art performance.
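The abstract does not specify how DST turns category similarities into labels, so the following is only a minimal sketch of one plausible reading: the one-hot label is softened by moving a small probability mass onto the other symbol categories in proportion to their similarity to the ground-truth category. The function names `cosine` and `soft_target`, the per-category embedding vectors, and the parameters `alpha` and `temperature` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def soft_target(true_idx, embeddings, alpha=0.1, temperature=0.5):
    """Build a soft label for the category `true_idx`.

    Hypothetical scheme (not the paper's exact method): keep 1 - alpha on the
    true category and spread alpha over the remaining categories with a
    softmax over their similarity to the true category.
    `embeddings` holds one vector per symbol category; at least two
    categories are required.
    """
    sims = [cosine(embeddings[true_idx], e) for e in embeddings]
    # Softmax over the non-target categories only.
    logits = [s / temperature for i, s in enumerate(sims) if i != true_idx]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    probs = iter(e / total for e in exps)
    return [
        1.0 - alpha if i == true_idx else alpha * next(probs)
        for i in range(len(embeddings))
    ]


# Toy example with made-up 2-D embeddings for three categories:
# index 0 and index 1 are visually similar symbols, index 2 is unrelated.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
target = soft_target(0, embeddings)
print(target)
```

In this sketch the similar category receives more of the redistributed mass than the unrelated one, so confusing two look-alike symbols is penalized less than confusing unrelated ones. If the embeddings were re-read from the model at each training step, the targets would change as training proceeds, which is one way the "dynamic" part of the name could be realized; that too is an assumption.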