Graph auto-encoders are widely used to construct graph representations in Euclidean vector spaces. However, empirical work has already shown that linear models can outperform graph auto-encoders on many tasks. In our work, we prove that the solution space induced by graph auto-encoders is a subset of the solution space of a linear map. This demonstrates that linear embedding models have at least the representational power of graph auto-encoders based on graph convolutional networks. So why are we still using nonlinear graph auto-encoders? One reason could be that actively restricting the linear solution space introduces an inductive bias that improves learning and generalization. While many researchers believe that the nonlinearity of the encoder is the critical ingredient towards this end, we instead identify the node features of the graph as a more powerful inductive bias. We give theoretical insights by introducing a corresponding bias in a linear model and analyzing the resulting change in the solution space. Our experiments agree with other empirical work on this question and show that the linear encoder can outperform the nonlinear encoder when using feature information.
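To make the comparison concrete, the following is a minimal sketch of the two encoder types, not the construction used in the proofs. It assumes the commonly used formulation of a graph auto-encoder: a two-layer graph convolutional encoder, a symmetrically normalized adjacency matrix with self-loops, and an inner-product decoder. The linear encoder shown here, which multiplies the propagated node features by a single weight matrix, is likewise an assumption about how the feature bias enters a linear model. All function names, weights, and the toy graph are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed, standard GCN choice)."""
    A_tilde = A + np.eye(A.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_encoder(A_hat, X, W0, W1):
    """Nonlinear encoder: two graph convolutions with a ReLU in between."""
    H = np.maximum(A_hat @ X @ W0, 0.0)
    return A_hat @ H @ W1

def linear_encoder(A_hat, X, W):
    """Linear encoder: one propagation step followed by a linear map (assumed form)."""
    return A_hat @ X @ W

def inner_product_decoder(Z):
    """Reconstruct edge probabilities as sigmoid(Z Z^T)."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

# Toy path graph on four nodes with random 3-dimensional node features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
A_hat = normalize_adjacency(A)

# Both encoders map the graph to 2-dimensional node embeddings.
Z_gcn = gcn_encoder(A_hat, X, rng.normal(size=(3, 5)), rng.normal(size=(5, 2)))
Z_lin = linear_encoder(A_hat, X, rng.normal(size=(3, 2)))

A_rec_gcn = inner_product_decoder(Z_gcn)
A_rec_lin = inner_product_decoder(Z_lin)
```

Both encoders produce an embedding matrix of the same shape and feed the same decoder, so they differ only in which embedding matrices they can reach. That set of reachable embeddings is the solution space the abstract refers to.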