In many applications, weighted networks are constructed from time series data: each time series is associated with a vertex, and edge weights are given by pairwise correlations. The result is a network whose edge dependency structure violates the assumptions of most common network models. Nonetheless, it is common to analyze these "correlation networks" using embedding methods derived from edge-independent network models, based on the belief that the edges are approximately independent. In this work, we put this modeling choice on firm theoretical ground. We show that when the time series are expressible in terms of a small number of Fourier basis elements (or in some other suitably chosen basis), correlation networks correspond to latent space networks with dependent edge noise, in which the vertex-level latent variables encode the basis coefficients. Further, we show that when time series are observed subject to noise, spectral embedding of the resulting noisy correlation network still recovers these true vertex-level latent representations under suitable assumptions. This characterization of embeddings as learning Fourier coefficients appears to be folklore in the signal processing community in the context of principal component analysis, but is, to the best of our knowledge, new to the statistical network analysis literature.
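The phenomenon described above can be illustrated numerically. The following is a minimal sketch, not the paper's actual model or estimator: time series are generated as vertex-specific linear combinations of two orthogonal Fourier basis elements plus noise, the correlation matrix is treated as a weighted adjacency matrix, and a rank-d adjacency spectral embedding is compared (after Procrustes alignment, to resolve the inherent orthogonal non-identifiability) against the unit-normalized coefficient vectors. All variable names and the specific noise level are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, d = 60, 256, 2   # vertices, time points, latent dimension

# Two orthogonal Fourier basis elements of equal norm; each vertex's
# series is a vertex-specific linear combination plus observation noise.
t = np.arange(T)
basis = np.stack([np.cos(2 * np.pi * t / T), np.sin(2 * np.pi * 3 * t / T)])
coeffs = rng.normal(size=(n, d))                  # latent vertex coefficients
series = coeffs @ basis + 0.05 * rng.normal(size=(n, T))

# Correlation network: weighted adjacency matrix of pairwise correlations.
A = np.corrcoef(series)

# Adjacency spectral embedding: top-d eigenpairs, scaled by sqrt(eigenvalue).
vals, vecs = np.linalg.eigh(A)
top = np.argsort(vals)[::-1][:d]
X_hat = vecs[:, top] * np.sqrt(vals[top])

# Since the basis elements have equal norm, corr(i, j) is approximately the
# inner product of the unit-normalized coefficient vectors, so those are the
# "true" latent positions; align the embedding to them by Procrustes.
Z = coeffs / np.linalg.norm(coeffs, axis=1, keepdims=True)
U, _, Vt = np.linalg.svd(X_hat.T @ Z)
err = np.linalg.norm(X_hat @ (U @ Vt) - Z) / np.linalg.norm(Z)
print(f"relative alignment error after Procrustes: {err:.3f}")
```

At low noise the relative alignment error is small, consistent with the claim that spectral embedding of the (noisy) correlation network recovers the vertex-level basis coefficients up to an orthogonal transform.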