Event extraction aims to recognize pre-defined event triggers and arguments in text, a task that suffers from a lack of high-quality annotations. In most NLP applications, incorporating large-scale synthetic training data is a practical and effective way to alleviate data scarcity. However, when applied to event extraction, recent data augmentation methods often neglect the problems of grammatical incorrectness, structure misalignment, and semantic drift, leading to unsatisfactory performance. To address these problems, we propose DAEE, a denoised structure-to-text augmentation framework for event extraction, which generates additional training data with a knowledge-based structure-to-text generation model and iteratively selects an effective subset of the generated data with a deep reinforcement learning agent. Experimental results on several datasets demonstrate that the proposed method generates more diverse text representations for event extraction and achieves results comparable to the state of the art.