Event detection is a crucial information extraction task in many domains, such as Wikipedia or news. The task typically relies on trigger detection (TD) -- identifying token spans in the text that evoke specific events. While the notion of triggers should ideally be universal across domains, domain transfer for TD from high- to low-resource domains results in significant performance drops. We address the problem of negative transfer for TD by coupling triggers between domains using subject-object relations obtained from a rule-based open information extraction (OIE) system. We demonstrate that relations injected through multi-task training can act as mediators between triggers in different domains, enhancing zero- and few-shot TD domain transfer and reducing negative transfer, in particular when transferring from a high-resource source Wikipedia domain to a low-resource target news domain. Additionally, we combine the extracted relations with masked language modeling on the target domain and obtain further TD performance gains. Finally, we demonstrate that the results are robust to the choice of the OIE system.