Since Pretrained Language Models (PLMs) are the cornerstone of the most recent Information Retrieval (IR) models, the way they encode semantic knowledge is particularly important. However, little attention has been given to PLMs' ability to capture hierarchical semantic knowledge. Traditionally, such knowledge is evaluated in a task-dependent way, through PLM performance on proxy tasks such as hypernymy detection. Unfortunately, this approach potentially ignores other implicit and complex taxonomic relations. In this work, we propose a task-agnostic method for evaluating the extent to which PLMs capture complex taxonomic relations, such as ancestors and siblings. The evaluation is based on intrinsic properties that capture the hierarchical nature of taxonomies. Our experimental evaluation shows that the lexico-semantic knowledge implicitly encoded in PLMs does not always capture hierarchical relations. We further demonstrate that the proposed properties can be injected into PLMs to improve their understanding of hierarchy. Through evaluations on taxonomy reconstruction, hypernym discovery, and reading comprehension tasks, we show that knowledge about hierarchy is moderately but not systematically transferable across tasks.