Learning to Evolve: Bayesian-Guided Continual Knowledge Graph Embedding

As social media and the World Wide Web become hubs for information dissemination, effectively organizing and understanding the vast amounts of dynamically evolving Web content is crucial. Knowledge graphs (KGs) provide a powerful framework for structuring this information. However, the rapid emergence of new hot topics, user relationships, and events on social media quickly renders traditional static knowledge graph embedding (KGE) models outdated. Continual Knowledge Graph Embedding (CKGE) aims to address this issue, but existing methods commonly suffer from catastrophic forgetting, whereby older but still valuable information is lost when learning new knowledge (such as new memes or trending events). As a result, the model cannot effectively learn how the data evolves. We propose BAKE, a novel CKGE framework. Unlike existing methods, BAKE formulates CKGE as a sequential Bayesian inference problem and uses the Bayesian posterior update principle as a natural continual learning strategy. This principle is insensitive to data order and provides theoretical guarantees to preserve prior knowledge as much as possible. Specifically, we treat each batch of new data as a Bayesian update to the model's prior. By maintaining the posterior distribution, the model effectively preserves earlier knowledge even as it evolves over multiple snapshots. Furthermore, to constrain the evolution of knowledge across snapshots, we introduce a continual clustering method that maintains the compact cluster structure of entity embeddings through a regularization term, ensuring semantic consistency while allowing controlled adaptation to new knowledge. We conduct extensive experiments on multiple CKGE benchmarks, which demonstrate that BAKE achieves the best performance in the vast majority of cases compared to existing approaches.
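
The posterior-update idea can be illustrated with a toy sketch. This is not BAKE's implementation: it assumes a single scalar parameter with a Gaussian prior and a Gaussian likelihood of known noise variance, and the names `GaussianBelief`, `bayes_update`, and `prior_penalty` are hypothetical. Under these assumptions, each snapshot's posterior becomes the prior for the next, and the final belief does not depend on the order in which snapshots arrive.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianBelief:
    """Scalar Gaussian belief stored in natural parameters (toy assumption)."""
    precision: float       # 1 / variance
    weighted_mean: float   # precision * mean

    @property
    def mean(self) -> float:
        return self.weighted_mean / self.precision


def bayes_update(prior: GaussianBelief, observations: list[float],
                 noise_var: float = 1.0) -> GaussianBelief:
    """Conjugate update: the returned posterior serves as the next prior."""
    n = len(observations)
    return GaussianBelief(
        precision=prior.precision + n / noise_var,
        weighted_mean=prior.weighted_mean + sum(observations) / noise_var,
    )


def prior_penalty(value: float, belief: GaussianBelief) -> float:
    """Negative log-density of `value` under `belief`, up to a constant.

    A quadratic term of this form is one way a previous posterior can
    act as a regularizer while fitting a new snapshot.
    """
    return 0.5 * belief.precision * (value - belief.mean) ** 2


# Two hypothetical snapshots of noisy observations of the same parameter.
prior = GaussianBelief(precision=1.0, weighted_mean=0.0)
snapshot_a = [0.9, 1.1, 1.0]
snapshot_b = [2.0, 2.2]

# Apply the snapshots in both orders.
a_then_b = bayes_update(bayes_update(prior, snapshot_a), snapshot_b)
b_then_a = bayes_update(bayes_update(prior, snapshot_b), snapshot_a)

# The two posteriors agree up to floating-point rounding.
print(math.isclose(a_then_b.mean, b_then_a.mean))
print(a_then_b.precision == b_then_a.precision)
```

In natural parameters the update is a plain sum of per-snapshot statistics, which is why the order of arrival has no effect in this toy setting. For embedding models with non-conjugate likelihoods the posterior generally has to be approximated, so exact order-invariance of this kind should not be assumed to carry over unchanged.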
