Multi-modal knowledge graphs have emerged as a powerful approach for information representation, combining data from different modalities such as text, images, and videos. While several such graphs have been constructed and have played important roles in applications like visual question answering and recommendation systems, challenges persist in their development. These include the scarcity of high-quality Chinese knowledge graphs and limited domain coverage in existing multi-modal knowledge graphs. This paper introduces MMPKUBase, a robust and extensive Chinese multi-modal knowledge graph that covers diverse domains, including birds, mammals, ferns, and more, comprising over 50,000 entities and over 1 million filtered images. To ensure data quality, we employ Prototypical Contrastive Learning and the Isolation Forest algorithm to refine the image data. Additionally, we have developed a user-friendly platform to facilitate image attribute exploration.