Undirected graphical models are widely used to model the conditional independence structure of vector-valued data. However, in many modern applications, for example those involving EEG and fMRI data, observations are more appropriately modeled as multivariate random functions rather than vectors. Functional graphical models have been proposed to model the conditional independence structure of such functional data. We propose a neighborhood selection approach to estimate the structure of Gaussian functional graphical models, where we first estimate the neighborhood of each node via a function-on-function regression and subsequently recover the entire graph structure by combining the estimated neighborhoods. Our approach only requires assumptions on the conditional distributions of random functions, and we estimate the conditional independence structure directly. We thus circumvent the need for a well-defined precision operator that may not exist when the functions are infinite dimensional. Additionally, the neighborhood selection approach is computationally efficient and can be easily parallelized. The statistical consistency of the proposed method in the high-dimensional setting is supported by both theory and experimental results. In addition, we study the effect of the choice of the function basis used for dimensionality reduction in an intermediate step. We give a heuristic criterion for choosing a function basis and motivate two practically useful choices, which we justify by both theory and experiments.
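The two-stage procedure described above (estimate each node's neighborhood by regression, then combine the neighborhoods) can be illustrated with a minimal sketch. The abstract does not specify the estimator, so everything concrete here is an assumption made for illustration: each function is taken to be already projected onto M basis coefficients, the function-on-function regression is replaced by a group-lasso regression on those coefficients solved by proximal gradient descent, and neighborhoods are combined with an AND or OR rule. The names `group_lasso`, `estimate_graph`, and `lam` are hypothetical and not taken from the paper.

```python
import numpy as np


def group_lasso(blocks, Y, lam, n_iter=500):
    """Proximal gradient descent (an assumed solver) for
    min_B (1/2n) * ||Y - sum_k X_k B_k||_F^2 + lam * sum_k ||B_k||_F.
    blocks: list of (n, M) predictor score matrices; Y: (n, M) response scores.
    """
    n, M = blocks[0].shape
    X = np.hstack(blocks)
    B = np.zeros((X.shape[1], Y.shape[1]))
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        Z = B - step * (X.T @ (X @ B - Y)) / n
        for k in range(len(blocks)):
            blk = slice(k * M, (k + 1) * M)
            norm = np.linalg.norm(Z[blk])
            # Block soft-thresholding: a small block is set to zero as a whole.
            shrink = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
            B[blk] = shrink * Z[blk]
    return [B[k * M:(k + 1) * M] for k in range(len(blocks))]


def estimate_graph(scores, lam, rule="and"):
    """scores: (n, p, M) array of basis coefficients for p random functions.
    Returns a symmetric boolean (p, p) adjacency matrix.
    """
    n, p, M = scores.shape
    nbr = np.zeros((p, p), dtype=bool)
    for j in range(p):  # each node is independent, so this loop parallelizes
        others = [k for k in range(p) if k != j]
        coefs = group_lasso([scores[:, k, :] for k in others], scores[:, j, :], lam)
        for k, Bk in zip(others, coefs):
            nbr[j, k] = np.linalg.norm(Bk) > 1e-8
    # Combine the p estimated neighborhoods into one undirected graph.
    return (nbr & nbr.T) if rule == "and" else (nbr | nbr.T)


# Toy data (simulated, not from the paper): node 1 depends on node 0,
# node 2 is independent of both.
rng = np.random.default_rng(0)
n, M = 300, 3
x0 = rng.standard_normal((n, M))
x1 = x0 + 0.1 * rng.standard_normal((n, M))
x2 = rng.standard_normal((n, M))
scores = np.stack([x0, x1, x2], axis=1)  # shape (n, p, M)
graph = estimate_graph(scores, lam=0.5)
print(graph.astype(int))
```

Because each node's regression is solved separately, the loop over `j` can be distributed across workers without coordination, which is the sense in which neighborhood selection parallelizes easily. The penalty level `lam` and the number of basis coefficients `M` are tuning choices that this sketch fixes by hand.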