Random Projections have been widely used to generate embeddings for various graph tasks due to their computational efficiency. Most of these applications have been justified through the Johnson-Lindenstrauss Lemma. In this paper, we go a step further and investigate how well the dot product and cosine similarity are preserved by Random Projections. Our analysis provides new theoretical results, identifies pathological cases, and tests them with numerical experiments. We find that, for nodes of low or high degree, the method produces especially unreliable embeddings for the dot product, regardless of whether the adjacency matrix or its normalized version, the transition matrix, is used. With respect to the statistical noise introduced by Random Projections, we show that cosine similarity yields markedly more precise approximations.
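As a rough illustration of the comparison described above, the following sketch projects the rows of a toy adjacency matrix with a Gaussian random matrix and contrasts the exact and projected values of the dot product and the cosine similarity for one node pair. Everything here is an assumption made for illustration and is not taken from the paper: the graph model, the planted hub node, the embedding dimension `k`, the Gaussian projection scaled by `1/sqrt(k)`, and the helper names `random_projection`, `cosine`, and `compare`.

```python
import numpy as np


def random_projection(X, k, rng):
    """Project the rows of X to k dimensions using a Gaussian matrix.

    Entries are drawn i.i.d. from N(0, 1/k), so dot products are
    preserved in expectation (an assumed, commonly used construction).
    """
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R


def cosine(u, v):
    """Cosine similarity of two nonzero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def compare(A, Z, i, j):
    """Exact vs. projected dot product and cosine similarity for nodes i, j."""
    return {
        "dot_exact": float(A[i] @ A[j]),
        "dot_rp": float(Z[i] @ Z[j]),
        "cos_exact": cosine(A[i], A[j]),
        "cos_rp": cosine(Z[i], Z[j]),
    }


rng = np.random.default_rng(0)
n, k = 2000, 128  # hypothetical graph size and embedding dimension

# Toy symmetric adjacency matrix of a sparse random graph (no self-loops).
upper = np.triu(rng.random((n, n)) < 0.01, 1)
A = (upper | upper.T).astype(float)

# Turn node 0 into a high-degree hub connected to about half of all nodes.
A[0, 1:] = rng.random(n - 1) < 0.5
A[1:, 0] = A[0, 1:]

Z = random_projection(A, k, rng)

# Compare the hub (node 0) with an ordinary node (node 1).
result = compare(A, Z, 0, 1)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Rerunning the script with different seeds, or replacing `A` with a row-normalized transition matrix, gives an informal sense of how much each projected quantity fluctuates around its exact value. It does not reproduce the paper's experiments.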