The primary objective of Optical Chemical Structure Recognition (OCSR) is to convert chemical structure images into corresponding markup sequences. However, the complex two-dimensional structures of molecules, particularly those with rings and multiple branches, make it difficult for current end-to-end methods to learn one-dimensional markup directly. To overcome this limitation, we propose a novel Ring-Free Language (RFL), which uses a divide-and-conquer strategy to describe chemical structures hierarchically. RFL decomposes a complex molecular structure into multiple parts, ensuring both uniqueness and conciseness while improving readability, which significantly reduces the learning difficulty for recognition models. Building on RFL, we propose a universal Molecular Skeleton Decoder (MSD), which comprises a skeleton generation module that progressively predicts the molecular skeleton and the individual rings, and a branch classification module that predicts branch information. Experimental results demonstrate that the proposed RFL and MSD can be applied to various mainstream methods and outperform state-of-the-art approaches in both printed and handwritten scenarios. The code is available at https://github.com/JingMog/RFL-MSD.
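To make the divide-and-conquer idea concrete, the following is a minimal Python sketch of separating a molecular graph into an acyclic skeleton and its rings. It is an illustration only and not the RFL grammar or the MSD implementation, neither of which is specified in this abstract. The function names `connected` and `split_rings`, the placeholder labels `R0`, `R1`, ..., and the representation of a molecule as a list of atom-ID pairs are all assumptions made for this example.

```python
from collections import defaultdict


def connected(adj, start, goal, skip_edge):
    """Return True if goal is reachable from start without using skip_edge."""
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for nxt in adj[node]:
            if {node, nxt} == set(skip_edge) or nxt in seen:
                continue
            seen.add(nxt)
            stack.append(nxt)
    return False


def split_rings(bonds):
    """Split a molecular graph into an acyclic skeleton plus its ring systems.

    bonds: list of (atom_id, atom_id) pairs (hypothetical input format).
    Returns (skeleton_bonds, rings). Each ring system is replaced in the
    skeleton by a placeholder node named "R0", "R1", ...
    """
    adj = defaultdict(set)
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)

    # A bond lies on a ring iff its endpoints stay connected once it is removed.
    ring_bonds = [(a, b) for a, b in bonds if connected(adj, a, b, (a, b))]

    # Group ring bonds into ring systems (fused rings end up in one group).
    ring_adj = defaultdict(set)
    for a, b in ring_bonds:
        ring_adj[a].add(b)
        ring_adj[b].add(a)
    owner, rings = {}, []
    for atom in sorted(ring_adj):
        if atom in owner:
            continue
        label = f"R{len(rings)}"
        members, stack = [], [atom]
        owner[atom] = label
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in ring_adj[node]:
                if nxt not in owner:
                    owner[nxt] = label
                    stack.append(nxt)
        rings.append(sorted(members))

    # Skeleton: keep non-ring bonds, redirecting ring atoms to their placeholder.
    ring_set = {frozenset(bond) for bond in ring_bonds}
    skeleton = [
        (owner.get(a, a), owner.get(b, b))
        for a, b in bonds
        if frozenset((a, b)) not in ring_set
    ]
    return skeleton, rings
```

For a toluene-like graph (a six-membered ring `C1` to `C6` with a substituent `C7` bonded to `C1`), this sketch returns the skeleton `[("R0", "C7")]` and a single ring containing `C1` through `C6`. Fused and spiro rings are grouped into one ring system here, a simplification that the actual RFL may well handle differently.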