Hypergraphs have a complex topology in which hyperedges express higher-order interactions among multiple entities. Recently, hypergraph-based deep learning methods that learn informative data representations for node classification on text-attributed hypergraphs have attracted increasing research attention. However, existing methods struggle to capture both the full extent of hypergraph structural information and the rich linguistic attributes of the nodes, which limits their effectiveness and generalizability. To overcome these challenges, we explore ways to augment a pretrained BERT model with specialized hypergraph-aware layers for node classification. These layers introduce a higher-order structural inductive bias into the language model, improving its capacity to exploit both the higher-order context provided by the hypergraph structure and the semantic information present in the text. In this paper, we propose HyperBERT, a mixed text-hypergraph architecture that models hypergraph relational structure while retaining the high-quality text encoding capabilities of a pretrained BERT. Notably, HyperBERT achieves a new state of the art on five challenging text-attributed hypergraph node classification benchmarks.
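To make the notion of higher-order structure concrete, the following is a minimal sketch of how a hypergraph can be stored as an incidence matrix and how node features can be mixed through hyperedges. It is a generic two-step mean aggregation (nodes to hyperedges, then hyperedges back to nodes), not the hypergraph-aware layer used in HyperBERT, whose details the abstract does not give. The function names, the toy hyperedges, and the 2-dimensional feature vectors (standing in for text embeddings) are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def incidence_matrix(num_nodes: int, hyperedges: List[List[int]]) -> List[List[int]]:
    """Return H where H[v][e] = 1 if node v belongs to hyperedge e, else 0."""
    H = [[0] * len(hyperedges) for _ in range(num_nodes)]
    for e, members in enumerate(hyperedges):
        for v in members:
            H[v][e] = 1
    return H


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def hyperedge_mean_aggregate(features: List[Vector],
                             hyperedges: List[List[int]]) -> List[Vector]:
    """Generic two-step aggregation (an illustrative assumption, not HyperBERT).

    Step 1: each hyperedge averages the features of its member nodes.
    Step 2: each node averages the features of its incident hyperedges.
    Nodes that belong to no hyperedge keep their own features.
    Hyperedges are assumed to be non-empty.
    """
    edge_feats = [mean([features[v] for v in members]) for members in hyperedges]
    out = []
    for v in range(len(features)):
        incident = [edge_feats[e] for e, members in enumerate(hyperedges) if v in members]
        out.append(mean(incident) if incident else list(features[v]))
    return out


# Hypothetical toy hypergraph: 4 nodes, 2 hyperedges; node 2 is in both.
toy_hyperedges = [[0, 1, 2], [2, 3]]
# Hypothetical 2-d vectors standing in for per-node text embeddings.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

H = incidence_matrix(4, toy_hyperedges)
aggregated = hyperedge_mean_aggregate(toy_features, toy_hyperedges)
```

Unlike an ordinary graph edge, the first hyperedge here connects three nodes at once, so node 2 receives context from two groups of different sizes. A trained model would typically replace the plain averages with learned transformations.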