Statistical limits of correlation detection in trees

In this paper we address the problem of testing whether two observed trees $(t,t')$ are sampled independently or from a joint distribution under which they are correlated. This problem, which we refer to as correlation detection in trees, plays a key role in the study of graph alignment for two correlated random graphs. Motivated by graph alignment, we investigate the conditions under which one-sided tests exist, i.e. tests with vanishing type I error and non-vanishing power in the limit of large tree depth. For the correlated Galton-Watson model with Poisson offspring of mean $\lambda>0$ and correlation parameter $s \in (0,1)$, we identify a phase transition in the limit of large degrees at $s = \sqrt{\alpha}$, where $\alpha \approx 0.3383$ is Otter's constant. Namely, we prove that no such test exists for $s \leq \sqrt{\alpha}$, and that such a test exists whenever $s > \sqrt{\alpha}$, for $\lambda$ large enough. This result sheds new light on the graph alignment problem in the sparse regime (with $O(1)$ average node degrees) and on the performance of the MPAlign method studied in Ganassali et al. (2021) and Piccioli et al. (2021). In particular, it proves the conjecture of Piccioli et al. (2021) that MPAlign succeeds in the partial recovery task for correlation parameter $s>\sqrt{\alpha}$, provided the average node degree $\lambda$ is large enough. As a byproduct, we identify a new family of orthogonal polynomials for the Poisson-Galton-Watson measure which enjoy remarkable properties. These polynomials may be of independent interest for a variety of problems involving graphs, trees or branching processes, beyond the scope of graph alignment.
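To make the two hypotheses concrete, the following Python sketch samples a pair of trees truncated at a given depth. The abstract does not spell out the correlated Galton-Watson construction, so the one used here is an assumption: each node has Poisson($\lambda s$) shared children that continue as correlated pairs, plus Poisson($\lambda(1-s)$) private children per tree that root independent Poisson Galton-Watson trees. All function names (`poisson`, `gw_tree`, `correlated_pair`, `size`) are hypothetical and chosen for illustration only.

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def gw_tree(lam, depth, rng):
    """Poisson Galton-Watson tree truncated at `depth`, as a nested list of children."""
    if depth == 0:
        return []
    return [gw_tree(lam, depth - 1, rng) for _ in range(poisson(lam, rng))]

def correlated_pair(lam, s, depth, rng):
    """Pair of trees sharing a common subtree (assumed construction, see lead-in)."""
    if depth == 0:
        return [], []
    left, right = [], []
    # Shared children: Poisson(lam * s) of them, each rooting a correlated pair.
    for _ in range(poisson(lam * s, rng)):
        a, b = correlated_pair(lam, s, depth - 1, rng)
        left.append(a)
        right.append(b)
    # Private children: Poisson(lam * (1 - s)) per tree, each rooting an independent tree.
    for side in (left, right):
        for _ in range(poisson(lam * (1 - s), rng)):
            side.append(gw_tree(lam, depth - 1, rng))
    # Shuffle so that child order does not reveal which children are shared.
    rng.shuffle(left)
    rng.shuffle(right)
    return left, right

def size(tree):
    """Number of nodes in a nested-list tree, root included."""
    return 1 + sum(size(child) for child in tree)

OTTER_ALPHA = 0.3383                  # approximate value quoted in the abstract
S_THRESHOLD = math.sqrt(OTTER_ALPHA)  # detection threshold on s, about 0.58
```

Under this assumed construction each tree taken alone is a Poisson Galton-Watson tree of mean $\lambda$, so a test cannot rely on the marginals and must exploit the shared structure. A pair drawn with `correlated_pair(lam, s, depth, rng)` for $s$ above `S_THRESHOLD` lies in the regime where, according to the abstract, a one-sided test exists for $\lambda$ large enough.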
