Entity Resolution (ER) is the problem of determining when two entity mentions refer to the same underlying entity. The problem has been studied for over 50 years and has recently taken on new importance in an era of large, heterogeneous 'knowledge graphs' published on the Web and used in domains as wide-ranging as social media, e-commerce, and search. This chapter discusses the specific problem of named ER in the context of personal knowledge graphs (PKGs). We begin with a formal definition of the problem and of the components necessary for high-quality, efficient ER. We also discuss some challenges that are expected to arise for Web-scale data. Next, we provide a brief literature review, with a special focus on how existing techniques can potentially apply to PKGs. We conclude the chapter by covering some applications, as well as promising directions for future research.