We conduct a comparative analysis of modularity-based methods for clustering nodes in binary hypergraphs. Statistical analysis and node clustering in hypergraphs form an emerging topic that suffers from a lack of standardization. In contrast to graphs, the notion of a node community in hypergraphs is not unique and covers several distinct situations. To address this, we first present, within a unified framework, the hypergraph modularity criteria proposed in the literature, emphasizing their differences and respective focuses. We then give an overview of the state-of-the-art codes available for maximizing hypergraph modularities to detect node communities in binary hypergraphs. Across a range of simulation settings with a controlled ground-truth clustering, we compare these methods on several quality measures: recovery of the true clustering, running time, (local) maximization of the objective, and number of clusters detected. Our contribution marks the first attempt to clarify the advantages and drawbacks of these newly available methods, and it lays the foundation for a better understanding of the primary objectives of modularity-based node clustering methods for binary hypergraphs.