Random projections have been widely used to generate embeddings for various graph learning tasks due to their computational efficiency. Most of these applications are justified through the Johnson-Lindenstrauss Lemma. In this paper, we go a step further and investigate how well dot products and cosine similarities are preserved when random projections are applied to the rows of the graph matrix. Our analysis provides new asymptotic and finite-sample results, identifies pathological cases, and validates them with numerical experiments. We specialize our general results to a ranking application by computing the probability that random projections flip the node ordering induced by the embeddings. We find that, depending on the degree distribution, the method produces especially unreliable embeddings for the dot product, regardless of whether the adjacency matrix or the normalized transition matrix is used. With respect to the statistical noise introduced by random projections, cosine similarity yields markedly more precise approximations.
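A minimal sketch of the setting studied here, not the paper's own code: the rows of a graph's adjacency matrix are projected with a Gaussian random matrix (the standard Johnson-Lindenstrauss construction), and the distortion of pairwise dot products and cosine similarities between the resulting node embeddings is measured. The graph model, size n, and projection dimension k are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, p = 500, 64, 0.05                       # nodes, projection dim, edge prob (illustrative)

# Adjacency matrix of a random undirected graph (Erdos-Renyi, for illustration).
A = (rng.random((n, n)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric, zero diagonal

# JL-style Gaussian projection: entries N(0, 1/k), so E[R @ R.T] = I.
R = rng.standard_normal((n, k)) / np.sqrt(k)
E = A @ R                                     # k-dimensional node embeddings

def cosine(X):
    """Pairwise cosine similarities between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                   # guard against isolated nodes
    Xn = X / norms
    return Xn @ Xn.T

# Distortion of the two similarity measures under the projection.
dot_err = np.abs(E @ E.T - A @ A.T)
cos_err = np.abs(cosine(E) - cosine(A))
print(f"mean |dot-product error|: {dot_err.mean():.4f}")
print(f"mean |cosine error|:      {cos_err.mean():.4f}")
```

Since E[R @ R.T] = I, the projected dot products E @ E.T are unbiased estimates of A @ A.T; the comparison above probes the variance of that estimate against the normalized (cosine) one, which is the contrast the paper analyzes.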