Many natural language processing tasks, such as coreference resolution and semantic role labeling, require selecting text spans and making decisions about them. A typical approach to such tasks is to score all possible spans and greedily select a subset of them for task-specific downstream processing. This approach, however, incorporates no inductive bias about which spans ought to be selected, for example, that selected spans tend to be syntactic constituents. In this paper, we propose a novel grammar-based structured span selection model that learns to make use of the partial span-level annotation provided for such problems. Unlike previous approaches, our model eliminates the heuristic greedy span selection scheme, allowing us to model the downstream task on an optimal set of spans. We evaluate our model on two popular span prediction tasks, coreference resolution and semantic role labeling, and show empirical improvements on both.
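To make the baseline concrete, the following is a minimal sketch of the score-then-greedily-select scheme described above. It illustrates the heuristic the paper replaces, not the proposed grammar-based model. All names (`enumerate_spans`, `crosses`, `greedy_select`), the width limit, the budget `k`, and the rule of skipping spans that cross an already selected span are illustrative assumptions rather than details taken from the paper.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def enumerate_spans(n_tokens: int, max_width: int) -> List[Span]:
    """List every span of at most max_width tokens (hypothetical width limit)."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start, min(start + max_width, n_tokens))
    ]


def crosses(a: Span, b: Span) -> bool:
    """True if the spans partially overlap, i.e. are neither nested nor disjoint."""
    return a[0] < b[0] <= a[1] < b[1] or b[0] < a[0] <= b[1] < a[1]


def greedy_select(
    spans: List[Span], score: Callable[[Span], float], k: int
) -> List[Span]:
    """Visit spans from highest to lowest score and keep up to k of them.

    Assumption: a span is skipped if it crosses a span that was already kept.
    Nothing here knows about syntax, which is the missing inductive bias
    the abstract points out.
    """
    selected: List[Span] = []
    for span in sorted(spans, key=score, reverse=True):
        if len(selected) == k:
            break
        if not any(crosses(span, kept) for kept in selected):
            selected.append(span)
    return sorted(selected)
```

In practice `score` would be a learned span scorer. Because each decision is local and made in score order, a high-scoring but non-constituent span can block better candidates, which is the kind of behavior a structured selection model is meant to avoid.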