Evaluating the semantic relatedness of Web resources is still an open challenge. This paper focuses on knowledge-based methods, which are an alternative to corpus-based approaches and generally rely on the availability of knowledge graphs. In particular, we selected 10 methods from the existing literature and organized them into three groups: adjacent resources, triple patterns, and triple weights-based methods. They have been implemented and evaluated using DBpedia as the reference RDF knowledge graph. Since DBpedia is continuously evolving, the experimental results reported for these methods in the literature are not comparable. For this reason, in this work all the methods were run on the same DBpedia release and against 14 well-known golden datasets. According to the correlation with human judgment observed in the experiments, weighting the RDF triples in combination with evaluating all the directed paths linking the compared resources is the best strategy for computing semantic relatedness in DBpedia.
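To make the strategy named in the last sentence more concrete, the following is a minimal sketch and not the implementation of any of the evaluated methods. It assumes a toy graph with made-up resources, predicates, and triple weights, scores each directed path by the product of its triple weights, sums the scores over both directions, and bounds the path length with a hypothetical `max_len` parameter. All of these choices are illustrative assumptions.

```python
from collections import defaultdict

# Toy weighted graph: (subject, predicate, object) -> weight in (0, 1].
# Resources, predicates, and weights are invented for illustration only.
TRIPLES = {
    ("A", "p", "B"): 0.5,
    ("B", "q", "C"): 0.4,
    ("A", "r", "C"): 0.1,
    ("C", "s", "A"): 0.2,
}


def build_adjacency(triples):
    """Map each subject to its outgoing (object, weight) pairs."""
    adj = defaultdict(list)
    for (s, _p, o), w in triples.items():
        adj[s].append((o, w))
    return adj


def directed_paths(adj, source, target, max_len):
    """Yield the weight list of every simple directed path from source
    to target that uses at most max_len triples."""
    stack = [(source, [source], [])]
    while stack:
        node, visited, weights = stack.pop()
        if node == target and weights:
            yield weights
            continue
        if len(weights) == max_len:
            continue
        for nxt, w in adj.get(node, []):
            if nxt not in visited:  # keep paths simple (no repeated resource)
                stack.append((nxt, visited + [nxt], weights + [w]))


def relatedness(triples, a, b, max_len=3):
    """Assumed aggregation: sum, over both directions, of the product
    of triple weights along each directed path."""
    adj = build_adjacency(triples)
    total = 0.0
    for src, dst in ((a, b), (b, a)):
        for weights in directed_paths(adj, src, dst, max_len):
            score = 1.0
            for w in weights:
                score *= w
            total += score
    return total
```

In this sketch, `relatedness(TRIPLES, "A", "C")` combines the paths A→B→C, A→C, and C→A. A real evaluation would replace the toy dictionary with triples and weights derived from a DBpedia release and compare the resulting scores with human judgments through a correlation measure.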