In today's rapidly expanding data landscape, knowledge extraction from unstructured text is vital for real-time analytics, temporal inference, and dynamic memory frameworks. However, traditional static knowledge graph (KG) construction often overlooks the dynamic and time-sensitive nature of real-world data, limiting adaptability to continuous change. Moreover, recent zero- or few-shot approaches that avoid domain-specific fine-tuning or reliance on prebuilt ontologies often suffer from instability across multiple runs and incomplete coverage of key facts. To address these challenges, we introduce ATOM (AdapTive and OptiMized), a few-shot, scalable approach that builds and continuously updates Temporal Knowledge Graphs (TKGs) from unstructured text. ATOM splits input documents into minimal, self-contained "atomic" facts, improving extraction exhaustivity and stability. It then constructs atomic TKGs from these facts, employing dual-time modeling that distinguishes between when information is observed and when it is valid. The resulting atomic TKGs are subsequently merged in parallel. Empirical evaluations show that ATOM achieves ~18% higher exhaustivity, ~33% better stability, and over 90% latency reduction compared to baseline methods, demonstrating strong scalability potential for dynamic TKG construction.
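To make the dual-time idea concrete, the following is a minimal sketch (not ATOM's actual implementation; all names, fields, and the merge strategy are illustrative assumptions): each atomic fact carries both an observation time and a validity time, and atomic TKGs represented as fact sets are merged by deduplicating union.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical structure illustrating dual-time modeling:
# t_obs = when the fact was observed in the source text,
# t_valid = when the stated fact actually holds.
@dataclass(frozen=True)
class AtomicFact:
    subject: str
    relation: str
    obj: str
    t_obs: date
    t_valid: date

def merge(atomic_tkgs):
    """Merge atomic TKGs (sets of facts) by union; identical
    facts extracted from different chunks collapse into one."""
    merged = set()
    for g in atomic_tkgs:
        merged |= g
    return merged

# Two atomic TKGs sharing one duplicate fact.
g1 = {AtomicFact("Alice", "works_at", "Acme",
                 date(2024, 1, 5), date(2023, 6, 1))}
g2 = {AtomicFact("Alice", "works_at", "Acme",
                 date(2024, 1, 5), date(2023, 6, 1)),
      AtomicFact("Acme", "based_in", "Paris",
                 date(2024, 1, 5), date(2020, 1, 1))}

tkg = merge([g1, g2])
print(len(tkg))  # 2
```

Because merging here is a commutative, associative set union, the atomic TKGs can be combined pairwise in parallel, which is the property the latency reduction relies on.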