Hypergraphs are characterized by complex topological structure and represent higher-order interactions among multiple entities through hyperedges. Hypergraph-based deep learning methods that learn informative representations for node classification on text-attributed hypergraphs have recently attracted growing research attention. However, existing methods struggle to capture both the full extent of hypergraph structural information and the rich linguistic attributes inherent in the node attributes, which largely hampers their effectiveness and generalizability. To overcome these challenges, we explore ways to augment a pretrained BERT model with specialized hypergraph-aware layers for node classification. These layers introduce a higher-order structural inductive bias into the language model, improving its capacity to exploit both the higher-order context in the hypergraph structure and the semantic information in the text. In this paper, we propose HyperBERT, a new mixed text-hypergraph architecture that models hypergraph relational structure while retaining the high-quality text encoding capabilities of a pretrained BERT. Notably, HyperBERT achieves new state-of-the-art results on five challenging text-attributed hypergraph node classification benchmarks.
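To make the notion of higher-order structure concrete, the following is a minimal sketch of one round of generic two-stage mean aggregation on a hypergraph (node to hyperedge, then hyperedge to node). It is not the HyperBERT layer, whose details the abstract does not give; the function names, the toy features, and the choice of plain averaging are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def hypergraph_mean_pass(features: Sequence[Vector],
                         hyperedges: Sequence[Sequence[int]]) -> List[Vector]:
    """One round of node -> hyperedge -> node mean aggregation (illustrative only)."""
    # Stage 1: each hyperedge summarizes all of its member nodes at once.
    edge_features = [mean([features[n] for n in edge]) for edge in hyperedges]

    updated = []
    for node, own in enumerate(features):
        incident = [edge_features[e]
                    for e, edge in enumerate(hyperedges) if node in edge]
        # Stage 2: each node averages its incident hyperedges;
        # a node that belongs to no hyperedge keeps its own features.
        updated.append(mean(incident) if incident else list(own))
    return updated


# Toy data (hypothetical): 4 nodes with 2-d features standing in for text embeddings.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [4.0, 4.0]]
hyperedges = [[0, 1, 2], [2, 3]]  # the first hyperedge joins three nodes at once
updated = hypergraph_mean_pass(features, hyperedges)
```

In this toy setup, node 2 belongs to both hyperedges, so its updated vector mixes information from all four nodes after a single round, which an ordinary pairwise edge could not express directly.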