Answering complex queries on knowledge graphs is important but particularly challenging because of data incompleteness. Query embedding methods address this issue with learning-based models that simulate logical reasoning through set operators. Previous works focus on specific forms of embeddings, while scoring functions between embeddings remain underexplored. In contrast to existing scoring functions, which are motivated by either local comparison or global transport, this work investigates the trade-off between the local and the global view using unbalanced optimal transport theory. Specifically, we embed sets as bounded measures in $\mathbb{R}$ endowed with a scoring function motivated by the Wasserstein-Fisher-Rao metric. Such a design also facilitates closed-form set operators in the embedding space. Moreover, we introduce a convolution-based algorithm for linear-time computation and a block-diagonal kernel to enforce the trade-off. Results show that the proposed method, WFRE, can outperform existing query embedding methods on standard datasets, on evaluation sets with combinatorially complex queries, and on hierarchical knowledge graphs. An ablation study shows that finding a better local and global trade-off is essential for performance improvement.