Knowledge Graphs (KGs) are a powerful representation of linked data, offering flexibility, semantic richness, and support for knowledge enrichment and reasoning. They help data owners organize and exploit heterogeneous data to provide insightful services (e.g., recommendations). Real-world KGs, however, are often incomplete, omitting true facts and missing valuable insights. Knowledge graph embedding (KGE) techniques are commonly used to infer this missing information. However, reasoning over KGs can inadvertently expose sensitive user information, even when such data is not explicitly stored. In this work, we investigate the privacy risks associated with KGE-based reasoning, focusing on attribute inference attacks, in which adversaries attempt to deduce sensitive user attributes from seemingly non-sensitive outputs. We propose and evaluate a framework that mitigates these privacy risks by applying post-processing sanitization techniques to KGE outputs. Preliminary results demonstrate that these attacks are effective against the outputs of KGE models. We also explore the trade-off between recommendation quality and privacy protection under randomization-based approaches; the results highlight the need to experiment with more advanced techniques in future work.
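As a rough illustration of what a randomization-based post-processing step on recommendation outputs could look like, the sketch below swaps each recommended item for a random catalog item with some probability. It is not the framework evaluated in this work: the function names `sanitize_recommendations` and `overlap`, the `flip_prob` parameter, and the overlap-based utility measure are all assumptions made for this example.

```python
import random


def sanitize_recommendations(recommended, catalog, flip_prob, rng):
    """Hypothetical sanitizer: with probability flip_prob, replace each
    recommended item by a random catalog item that is neither in the
    original list nor already emitted. Assumes `recommended` has no
    duplicates; the output keeps the same length and stays duplicate-free."""
    output = []
    used = set()
    for item in recommended:
        candidate = item
        if rng.random() < flip_prob:
            pool = [c for c in catalog if c not in used and c not in recommended]
            if pool:  # keep the original item if no replacement is available
                candidate = rng.choice(pool)
        output.append(candidate)
        used.add(candidate)
    return output


def overlap(original, sanitized):
    """Fraction of original items that survive sanitization (a crude
    stand-in for recommendation quality)."""
    return len(set(original) & set(sanitized)) / len(original)


if __name__ == "__main__":
    rng = random.Random(0)
    catalog = [f"item{i}" for i in range(20)]
    top_k = catalog[:5]
    for p in (0.0, 0.5, 1.0):
        noisy = sanitize_recommendations(top_k, catalog, p, rng)
        print(p, overlap(top_k, noisy))
```

Raising `flip_prob` gives an attacker less signal to infer attributes from, but also lowers the overlap with the original recommendations, which is the quality versus privacy trade-off described above.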