Event detection is a crucial information extraction task in many domains, such as Wikipedia or news. The task typically relies on trigger detection (TD) -- identifying token spans in the text that evoke specific events. While the notion of triggers should ideally be universal across domains, domain transfer for TD from high- to low-resource domains results in significant performance drops. We address the problem of negative transfer in TD by coupling triggers between domains using subject-object relations obtained from a rule-based open information extraction (OIE) system. We demonstrate that OIE relations injected through multi-task training can act as mediators between triggers in different domains, enhancing zero- and few-shot TD domain transfer and reducing performance drops, in particular when transferring from a high-resource source domain (Wikipedia) to a low(er)-resource target domain (news). Additionally, we combine this improved transfer with masked language modeling on the target domain, observing further TD transfer gains. Finally, we demonstrate that the gains are robust to the choice of the OIE system.
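As a rough illustration of what injecting OIE relations through multi-task training could look like, the sketch below sums a trigger-detection loss and an auxiliary OIE tagging loss computed over the same tokens. It is a minimal sketch under stated assumptions, not the training objective used in this work: the label sets (`TRIGGER`, `SUBJ`, `REL`, `OBJ`, `O`), the function names, and the `oie_weight` parameter are all hypothetical.

```python
import math

def cross_entropy(token_probs, gold_labels):
    """Mean negative log-likelihood of the gold labels.

    token_probs: one dict per token, mapping label -> predicted probability.
    gold_labels: one gold label per token.
    """
    total = sum(-math.log(probs[gold]) for probs, gold in zip(token_probs, gold_labels))
    return total / len(gold_labels)

def multitask_loss(td_probs, td_gold, oie_probs, oie_gold, oie_weight=1.0):
    """Joint loss: trigger detection plus a weighted auxiliary OIE tagging loss.

    Assumption: both tasks tag the same tokens from a shared encoder, and
    oie_weight is a hypothetical hyperparameter balancing the two tasks.
    """
    td_loss = cross_entropy(td_probs, td_gold)
    oie_loss = cross_entropy(oie_probs, oie_gold)
    return td_loss + oie_weight * oie_loss

# Toy two-token sentence with made-up model outputs.
td_probs = [{"O": 0.9, "TRIGGER": 0.1}, {"O": 0.2, "TRIGGER": 0.8}]
td_gold = ["O", "TRIGGER"]
oie_probs = [
    {"O": 0.1, "SUBJ": 0.7, "REL": 0.1, "OBJ": 0.1},
    {"O": 0.1, "SUBJ": 0.1, "REL": 0.7, "OBJ": 0.1},
]
oie_gold = ["SUBJ", "REL"]

loss = multitask_loss(td_probs, td_gold, oie_probs, oie_gold, oie_weight=0.5)
```

Setting `oie_weight` to 0 reduces the objective to plain trigger detection, which makes it easy to compare a run with the auxiliary OIE signal against one without it.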