QuanTaxo: A Quantum Approach to Self-Supervised Taxonomy Expansion

A taxonomy is a hierarchical graph that organizes knowledge and provides valuable insights for various web applications. Online retail organizations like Microsoft and Amazon use taxonomies to improve product recommendations and to optimize advertising through better query interpretation. However, constructing taxonomies manually requires significant human effort. As web content expands at an unprecedented pace, existing taxonomies risk becoming outdated and struggle to incorporate new and emerging information. Consequently, there is a growing need for dynamic taxonomy expansion that keeps taxonomies relevant and up to date. Existing taxonomy expansion methods often rely on classical word embeddings to represent entities. These embeddings fall short in capturing hierarchical polysemy, where an entity's meaning varies with its position in the hierarchy and its surrounding context. To address this challenge, we introduce QuanTaxo, a quantum-inspired framework for taxonomy expansion. QuanTaxo encodes entity representations in quantum space and models hierarchical polysemy by leveraging the principles of Hilbert space to capture interference effects between entities, yielding richer and more nuanced representations. Comprehensive experiments on four real-world benchmark datasets show that QuanTaxo significantly outperforms classical embedding models, achieving improvements of 18.45% in accuracy, 20.5% in Mean Reciprocal Rank, and 17.87% in Wu & Palmer similarity over eight classical embedding-based baselines. We further demonstrate the advantages of QuanTaxo through extensive ablation and case studies.
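For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how accuracy, Mean Reciprocal Rank, and Wu & Palmer similarity are commonly computed when a model ranks candidate parents for a new entity. It is not taken from the QuanTaxo implementation. The toy taxonomy, the function names (path_from_root, wu_palmer, accuracy, mean_reciprocal_rank), and the convention that the root has depth 1 are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy, stored as child -> parent (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "science": None,
    "physics": "science",
    "biology": "science",
    "quantum mechanics": "physics",
    "genetics": "biology",
}


def path_from_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the list of nodes from the root down to `node`, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path[::-1]


def wu_palmer(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """Wu & Palmer similarity: 2 * depth(LCA) / (depth(a) + depth(b)).

    Assumption: depth is counted in nodes, so the root has depth 1.
    """
    path_a = path_from_root(a, parent)
    path_b = path_from_root(b, parent)
    lca_depth = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        lca_depth += 1  # length of the shared prefix = depth of the LCA
    return 2 * lca_depth / (len(path_a) + len(path_b))


def accuracy(ranks: List[int]) -> float:
    """Fraction of queries whose true parent is ranked first."""
    return sum(1 for r in ranks if r == 1) / len(ranks)


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Mean of 1 / rank of the true parent over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Hypothetical 1-based ranks of the true parent for three query entities.
    ranks = [1, 2, 4]
    print("accuracy:", accuracy(ranks))
    print("MRR:", mean_reciprocal_rank(ranks))
    print("Wu&P:", wu_palmer("quantum mechanics", "physics", PARENT))
```

In this sketch, Wu & Palmer similarity rewards a prediction that lands near the true parent in the hierarchy even when it is not an exact match, which is why it is usually reported alongside the stricter accuracy and rank-based scores.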
