In Natural Language Processing, entity linking (EL) has centered on Wikipedia, yet it remains underexplored for the job market domain. Disambiguating skill mentions can provide insight into current labor market demands. In this work, we are the first to explore EL in this domain, specifically targeting the linkage of occupational skills to the ESCO taxonomy (le Vrang et al., 2014). Previous efforts linked coarse-grained (full) sentences to a corresponding ESCO skill; we instead link more fine-grained span-level mentions of skills. We tune two high-performing neural EL models, the bi-encoder BLINK (Wu et al., 2020) and the autoregressive model GENRE (Cao et al., 2021), on a synthetically generated mention--skill pair dataset and evaluate them on a human-annotated skill-linking benchmark. Our findings reveal that both models are capable of linking implicit mentions of skills to their correct taxonomy counterparts. Empirically, BLINK outperforms GENRE in strict evaluation, but GENRE performs better in loose evaluation (accuracy@$k$).
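To make the loose evaluation metric concrete, the following is a minimal sketch of how accuracy@$k$ could be computed over ranked candidate lists. It is an illustration, not the evaluation code used in this work: the function name `accuracy_at_k`, the toy skill labels, and the reading of strict evaluation as the $k=1$ case are all assumptions.

```python
from typing import Sequence


def accuracy_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    gold_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of mentions whose gold skill appears in the top-k candidates.

    ranked_predictions[i] is the model's candidate list for mention i,
    ordered from most to least likely. Hypothetical helper for illustration.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(gold_labels):
        raise ValueError("predictions and gold labels must have equal length")
    if not gold_labels:
        return 0.0
    hits = sum(
        1
        for candidates, gold in zip(ranked_predictions, gold_labels)
        if gold in candidates[:k]
    )
    return hits / len(gold_labels)


# Toy example with made-up skill labels (not taken from the benchmark).
ranked = [
    ["use spreadsheets", "manage budgets", "analyse data"],
    ["lead a team", "communicate with customers", "plan schedules"],
    ["write reports", "translate texts", "edit documents"],
]
gold = ["use spreadsheets", "communicate with customers", "program in Python"]

strict = accuracy_at_k(ranked, gold, k=1)  # assumed to mirror strict evaluation
loose = accuracy_at_k(ranked, gold, k=3)   # loose evaluation with k = 3
```

In this toy setup only the first mention is linked correctly at rank 1, while the second is recovered once the top three candidates are considered, so the loose score is higher than the strict one.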