In this paper, we examine and address the challenges of the coordination recognition task. Most existing methods rely on syntactic parsers to identify the coordinators in a sentence and detect the coordination boundaries. However, state-of-the-art syntactic parsers are slow and error-prone, especially on long and complicated sentences. To address these problems, we propose a pipeline model, COordination RECognizer (CoRec). It consists of two components: a coordinator identifier and a conjunct boundary detector. Experimental results on datasets from various domains demonstrate the effectiveness and efficiency of the proposed method. Further experiments show that CoRec positively impacts downstream tasks, improving the yield of state-of-the-art Open IE models.