Towards hypergraph cognitive networks as feature-rich models of knowledge

Semantic networks provide a useful tool to understand how related concepts are retrieved from memory. However, most current network approaches use pairwise links to represent memory recall patterns. Pairwise connections neglect higher-order associations, i.e. relationships between more than two concepts at a time. These higher-order interactions might covary with (and thus contain information about) how similar concepts are along psycholinguistic dimensions like arousal, valence, familiarity, gender and others. We overcome these limits by introducing feature-rich cognitive hypergraphs as quantitative models of human memory where: (i) concepts recalled together can engage in hyperlinks that may involve more than two concepts at once (cognitive hypergraph aspect), and (ii) each concept is endowed with a vector of psycholinguistic features (feature-rich aspect). We build hypergraphs from word association data and use feature evaluation methods from machine learning to predict concept concreteness. Since concepts with similar concreteness tend to cluster together in human memory, we expect to be able to leverage this structure. Using word association data from the Small World of Words dataset, we compare a pairwise network and a hypergraph, each with N=3586 concepts/nodes. Interpretable artificial intelligence models trained on (1) psycholinguistic features only, (2) pairwise-based feature aggregations, and (3) hypergraph-based feature aggregations show significant differences between pairwise and hypergraph links. Specifically, our results show that higher-order, feature-rich hypergraph models contain richer information than pairwise networks, leading to improved prediction of word concreteness. We discuss how these results relate to previous studies on conceptual clustering and compartmentalisation in associative knowledge and human memory.
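To make the contrast between pairwise and hypergraph aggregation concrete, the following is a minimal sketch rather than the pipeline used in the study. It assumes that each cue word together with all of its responses forms one hyperlink, that the pairwise network links a cue to each response separately, and that features are aggregated by a simple mean over neighbours. The toy associations, the `valence` scores, and the function names are all hypothetical and serve illustration only.

```python
from statistics import mean

# Hypothetical toy word-association data: cue -> list of responses.
associations = {
    "dog": ["cat", "bone", "bark"],
    "cat": ["dog", "mouse"],
    "idea": ["thought", "mind"],
}

# Hypothetical single psycholinguistic feature per concept (e.g. valence).
valence = {
    "dog": 0.8, "cat": 0.7, "bone": 0.5, "bark": 0.4,
    "mouse": 0.5, "idea": 0.7, "thought": 0.6, "mind": 0.6,
}


def build_hyperedges(assoc):
    """Assumption: a cue and all its responses form one hyperlink."""
    return [frozenset([cue, *responses]) for cue, responses in assoc.items()]


def build_pairwise_edges(assoc):
    """Assumption: the pairwise network links a cue to each response."""
    return {
        frozenset((cue, response))
        for cue, responses in assoc.items()
        for response in responses
        if response != cue
    }


def aggregate(node, groups, feature):
    """Mean feature value over all other nodes sharing a link with `node`."""
    neighbours = set()
    for group in groups:
        if node in group:
            neighbours |= group - {node}
    if not neighbours:
        return None
    return mean(feature[n] for n in neighbours)


pairwise = build_pairwise_edges(associations)
hyper = build_hyperedges(associations)

# The two aggregates can then be used as input features for a predictor
# of concreteness, alongside the node's own psycholinguistic features.
pairwise_valence = aggregate("cat", pairwise, valence)
hyper_valence = aggregate("cat", hyper, valence)
```

In this toy example the two aggregates for "cat" differ because the hyperlink cued by "dog" also brings "bone" and "bark" into the neighbourhood of "cat", whereas the pairwise network connects "cat" only to "dog" and "mouse". This is the kind of additional higher-order information that hypergraph-based aggregations can expose to a downstream model.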
