This paper presents a novel approach for constructing associative knowledge graphs that store and recognize sequences effectively. The graph is built by representing overlapping sequences of objects as tightly connected clusters within the larger graph. Individual objects, represented as nodes, can belong to multiple sequences or appear repeatedly within a single sequence. To retrieve a sequence, we supply context: a subset of its objects that triggers an association with the complete sequence. The system's memory capacity is determined by the size of the graph and the density of its connections. We derive theoretically the relationship between the critical density of the graph and its memory capacity for storing sequences, where the critical density is the point beyond which error-free sequence reconstruction becomes impossible. Furthermore, we develop an efficient algorithm for ordering elements within a sequence. Extensive experiments with various types of sequences confirm the validity of these relationships. The approach has potential applications in diverse fields, such as anomaly detection in financial transactions or prediction of user behavior based on past actions.
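To make the storage and retrieval idea concrete, the following is a minimal sketch and not the construction used in the paper. It assumes that each sequence is stored as a directed clique (an edge from every earlier element to every later one), that no element repeats within a sequence, and that the graph is sparse enough for a context to identify one cluster. The names `SequenceGraph`, `store`, and `recall` are hypothetical.

```python
from itertools import combinations


class SequenceGraph:
    """Toy associative graph: each stored sequence becomes a directed clique.

    Illustrative assumptions (not taken from the paper): no repeated
    elements within a sequence, and a graph sparse enough that the
    context nodes are fully linked to only one cluster.
    """

    def __init__(self):
        # node -> set of nodes that follow it in at least one stored sequence
        self.successors = {}

    def store(self, sequence):
        for node in sequence:
            self.successors.setdefault(node, set())
        # combinations() preserves input order, so every edge points
        # from an earlier element to a later one.
        for earlier, later in combinations(sequence, 2):
            self.successors[earlier].add(later)

    def _linked(self, a, b):
        # True if the two nodes are connected in either direction.
        return b in self.successors[a] or a in self.successors[b]

    def recall(self, context):
        context = set(context)
        # Members of the cluster: the context itself plus every node
        # connected to all context nodes.
        members = {
            node
            for node in self.successors
            if node in context or all(self._linked(node, c) for c in context)
        }
        # Within a directed clique, the number of predecessors inside the
        # cluster equals the element's position in the sequence.
        return sorted(
            members,
            key=lambda n: sum(n in self.successors[m] for m in members),
        )


graph = SequenceGraph()
graph.store(["a", "b", "c", "d"])
graph.store(["c", "e", "f", "g"])  # shares "c" with the first sequence
graph.store(["h", "b", "i", "f"])  # shares "b" and "f" with the others

print(graph.recall({"a", "d"}))  # ['a', 'b', 'c', 'd']
print(graph.recall({"e", "g"}))  # ['c', 'e', 'f', 'g']
```

In this toy, adding more sequences over the same set of nodes raises the connection density, and eventually nodes from other sequences become linked to every context node and contaminate the recalled cluster. This is an informal illustration of why a critical density limits error-free reconstruction; it does not reproduce the theoretical relationship derived in the paper.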