Previous work has established that neural network-based node embeddings can differ between training runs that use identical parameters on the same dataset, solely because of different random seeds. However, how key hyperparameters such as the embedding dimension affect this instability has not been thoroughly analyzed. In this work, we investigate how varying the dimensionality of node embeddings influences both their stability and their downstream performance. We systematically evaluate five widely used methods -- ASNE, DGI, GraphSAGE, node2vec, and VERSE -- across multiple datasets and embedding dimensions. We assess stability from both a representational and a functional perspective, alongside downstream performance. Our results show that embedding stability varies significantly with dimensionality, but the pattern differs across methods: some approaches, such as node2vec and ASNE, tend to become more stable with higher dimensionality, whereas other methods do not exhibit this trend. Moreover, we find that maximum stability does not necessarily align with optimal task performance. These findings highlight the importance of carefully selecting the embedding dimension and provide new insights into the trade-offs between stability, performance, and computational efficiency in graph representation learning.
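To make the two notions of stability concrete, the following is a minimal sketch of how they could be measured for two embeddings of the same nodes trained with different seeds. The abstract does not name the measures used, so everything here is an assumption for illustration: the function names `knn_indices`, `knn_jaccard`, and `prediction_agreement` are hypothetical, representational stability is approximated by the overlap of cosine k-nearest-neighbor sets, and functional stability by the fraction of nodes on which two downstream classifiers agree.

```python
import numpy as np


def knn_indices(emb, k):
    """Indices of each row's k nearest neighbors by cosine similarity.

    Assumes no row of `emb` is the zero vector.
    """
    normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude each node from its own neighborhood
    return np.argsort(-sim, axis=1)[:, :k]


def knn_jaccard(emb_a, emb_b, k=10):
    """Mean Jaccard overlap of k-NN sets (illustrative representational measure).

    `emb_a` and `emb_b` must embed the same nodes in the same row order;
    their dimensions may differ.
    """
    nn_a = knn_indices(emb_a, k)
    nn_b = knn_indices(emb_b, k)
    scores = []
    for row_a, row_b in zip(nn_a, nn_b):
        set_a, set_b = set(row_a.tolist()), set(row_b.tolist())
        scores.append(len(set_a & set_b) / len(set_a | set_b))
    return float(np.mean(scores))


def prediction_agreement(pred_a, pred_b):
    """Fraction of nodes with identical predicted labels (illustrative functional measure)."""
    return float(np.mean(np.asarray(pred_a) == np.asarray(pred_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    run_a = rng.normal(size=(50, 8))      # stand-in for one training run
    run_b = rng.normal(size=(50, 8))      # stand-in for an unrelated run
    print(knn_jaccard(run_a, run_a, k=5))  # identical runs share all neighbors
    print(knn_jaccard(run_a, run_b, k=5))  # unrelated runs share few neighbors
```

A neighborhood-based measure is chosen in this sketch because it is unaffected by rotations of the embedding space, which can differ between runs without changing the structure a downstream task sees.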