The evolution of molecular and phenotypic traits is commonly modelled using Markov processes along a rooted phylogeny. This phylogeny can be a tree, or a network if it includes reticulations, which represent events such as hybridization or admixture. Computing the likelihood of data observed at the leaves becomes costly as the size and complexity of the phylogeny grow. Efficient algorithms exist for trees, but cannot be applied to networks. We show that a vast array of models for trait evolution along phylogenetic networks can be reformulated as graphical models, for which efficient belief propagation algorithms exist. We provide a brief review of belief propagation on general graphical models, then focus on linear Gaussian models for continuous traits. We show how belief propagation techniques can be applied for exact or approximate (but more scalable) likelihood and gradient calculations, and prove novel results for efficient parameter inference of some models. We highlight the possible fruitful interactions between graphical models and phylogenetic methods. For example, approximate likelihood approaches have the potential to greatly reduce computational costs for phylogenies with reticulations.