We analyze Local Degree Profile (LDP), a topological graph descriptor that serves as a widely used structural baseline for graph classification. Our study evaluates models within the recently developed fair evaluation framework, which defines rigorous model selection and evaluation routines for graph classification and thereby ensures reproducible, comparable results. Based on the insights obtained, we propose a new baseline algorithm, Local Topological Profile (LTP), which extends LDP with additional centrality measures and local vertex descriptors. On every dataset used, the new approach either outperforms the latest GNNs or comes very close to them; in particular, it achieves state-of-the-art results on 4 out of 9 benchmark datasets. We also examine the computational aspects of LDP-based feature extraction and model construction, and propose practical improvements to execution speed and scalability. These improvements make it possible to handle modern, large datasets and extend the portfolio of benchmarks used in graph representation learning. The outcome is LTP: a baseline that is simple to understand, fast, and scalable, yet robust, and capable of outcompeting modern graph classification models such as Graph Isomorphism Network (GIN). We provide an open-source implementation at \href{https://github.com/j-adamczyk/LTP}{GitHub}.
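To make the LDP baseline concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly described form of LDP: each vertex is characterized by its degree and by the minimum, maximum, mean, and standard deviation of its neighbors' degrees, and each of these five features is then aggregated over the graph with a histogram. The function names, the dictionary-based graph representation, the number of bins, and the per-graph binning range are all illustrative assumptions. The additional descriptors used by LTP are not sketched, since the abstract does not enumerate them.

```python
from statistics import mean, pstdev


def ldp_node_features(adjacency):
    """Per-vertex LDP-style features (hypothetical helper).

    adjacency: dict mapping each vertex to the set of its neighbours
    (undirected simple graph). Returns a dict mapping each vertex to
    (degree, min, max, mean, std) where the last four are statistics
    of the neighbours' degrees.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    features = {}
    for v, nbrs in adjacency.items():
        nbr_degs = [degree[u] for u in nbrs]
        if nbr_degs:
            stats = (min(nbr_degs), max(nbr_degs),
                     mean(nbr_degs), pstdev(nbr_degs))
        else:
            # Assumption: isolated vertices get all-zero neighbour statistics.
            stats = (0, 0, 0.0, 0.0)
        features[v] = (degree[v], *stats)
    return features


def histogram(values, bins, lo, hi):
    """Normalized histogram with equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid division by zero when lo == hi
    for x in values:
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts] if total else counts


def ldp_graph_embedding(adjacency, bins=4):
    """Concatenate one histogram per feature into a graph-level vector.

    Assumption: the binning range is taken per graph, as a simplification.
    """
    feats = list(ldp_node_features(adjacency).values())
    embedding = []
    for column in zip(*feats):
        embedding.extend(histogram(column, bins, min(column), max(column)))
    return embedding


# Usage: a path graph on three vertices, 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
vector = ldp_graph_embedding(path, bins=4)  # 5 features x 4 bins
```

The resulting fixed-length vector could be passed to any standard classifier. Because it relies only on degree statistics, extraction of this kind requires no training, which is what makes such descriptors attractive as baselines.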