Word Sense Disambiguation (WSD) is the task of associating a word in a given context with its most suitable meaning among a set of possible candidates. While the task has recently witnessed renewed interest, with systems achieving performance above the estimated inter-annotator agreement, at the time of writing it still struggles to find downstream applications. We argue that one of the reasons behind this is the difficulty of applying WSD to plain text. Indeed, in the standard formulation, models work under the assumptions that a) all the spans to disambiguate have already been identified, and b) all the possible candidate senses of each span are provided, both of which are requirements that are far from trivial to satisfy. In this work, we present a new task called Word Sense Linking (WSL) where, given an input text and a reference sense inventory, systems have to both identify which spans to disambiguate and then link them to their most suitable meaning. We put forward a transformer-based architecture for the task and thoroughly evaluate both its performance and that of state-of-the-art WSD systems scaled to WSL, iteratively relaxing the assumptions of WSD. We hope that our work will foster easier integration of lexical semantics into downstream applications.
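To make the difference between the two formulations concrete, the following is a minimal sketch of the WSL input/output contract: plain text and a sense inventory go in, and (span, sense) links come out, with neither spans nor candidate senses supplied in advance. It is not the transformer-based architecture proposed in this work. The toy inventory, the `Link` type, the `link_senses` function, and the sense identifiers are all hypothetical, and the disambiguation step is a simple gloss-overlap heuristic (in the spirit of simplified Lesk) chosen only because it fits in a few lines.

```python
from typing import NamedTuple

# Hypothetical toy inventory (not WordNet): lemma -> {sense_id: gloss}.
INVENTORY = {
    "bank": {
        "bank.n.1": "financial institution that accepts money deposits",
        "bank.n.2": "sloping land beside a river",
    },
    "river": {"river.n.1": "large natural stream of water"},
    "interest rate": {"interest_rate.n.1": "percentage charged on a loan"},
}


class Link(NamedTuple):
    start: int      # token index, inclusive
    end: int        # token index, exclusive
    sense_id: str


def link_senses(text, inventory=INVENTORY, max_len=3):
    """Toy WSL baseline: find spans, generate candidates, pick a sense.

    Assumes whitespace-tokenized, punctuation-free input.
    """
    tokens = text.lower().split()
    links = []
    i = 0
    while i < len(tokens):
        # Span identification: longest inventory match starting at token i.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span not in inventory:
                continue
            # Candidate generation: every sense the inventory lists for the span.
            senses = inventory[span]
            # Disambiguation: gloss with the largest word overlap with the context.
            context = set(tokens[:i] + tokens[i + n:])
            best = max(senses, key=lambda s: len(context & set(senses[s].split())))
            links.append(Link(i, i + n, best))
            i += n
            break
        else:
            i += 1  # no span starts here; move to the next token
    return links


print(link_senses("she sat on the bank of the river"))
print(link_senses("the bank raised its interest rate on borrowed money"))
```

In a standard WSD setup, the span boundaries and the candidate sets would be handed to the model; here both are derived from the inventory, which is the part of the pipeline that WSL makes the system's responsibility.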