Few-shot event detection (ED) has been widely studied, but existing work differs noticeably in motivations, tasks, and experimental settings, and these discrepancies hinder the understanding of models and future progress. This paper presents a thorough empirical study, a unified view of ED models, and a better unified baseline. For fair evaluation, we choose two practical settings: a low-resource setting to assess generalization ability and a class-transfer setting to assess transferability. We compare ten representative methods, roughly grouped into prompt-based and prototype-based models, on three datasets for detailed analysis. To investigate the superior performance of prototype-based methods, we break down their design and build a unified framework. Based on this framework, we not only propose a simple yet effective method (e.g., 2.7% F1 gains under the low-resource setting) but also offer many valuable insights for future research.