Bayesian inference has relied predominantly on Markov chain Monte Carlo (MCMC) algorithms for many years. However, MCMC is computationally expensive, especially for complex phylogenetic models of time trees. This bottleneck has motivated the search for alternatives, such as variational Bayes, which can scale better to large datasets. In this paper, we introduce torchtree, a framework written in Python that allows developers to easily implement rich phylogenetic models and algorithms using a fixed tree topology. One can either use automatic differentiation or leverage torchtree's plug-in system to compute gradients analytically for model components for which automatic differentiation is slow. We demonstrate that the torchtree variational inference framework performs similarly to BEAST in terms of speed and approximation accuracy. Furthermore, we explore the use of the forward KL divergence as an optimization criterion for variational inference, which can handle discontinuous and non-differentiable models. Our experiments show that inference using the forward KL divergence tends to be faster per iteration than inference using the evidence lower bound (ELBO) criterion, although ELBO-based inference may converge faster in some cases. Overall, torchtree provides a flexible and efficient framework for phylogenetic model development and inference using PyTorch.
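To illustrate why a forward KL criterion can cope with discontinuous models, the following is a minimal sketch on a toy one-dimensional problem. It is not torchtree's implementation or API. The target density, the Gaussian variational family, and the names `log_target`, `log_normal`, and `fit_forward_kl` are all assumptions made for this example. For a Gaussian family, minimizing KL(p || q) amounts to matching the mean and variance of p, so the sketch estimates those moments by self-normalized importance sampling and iterates. No gradient of the model is ever taken.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of a hypothetical toy model.

    The density is 1 on [0, 1), 3 on [1, 2], and 0 elsewhere, so it has
    jumps at 0, 1, and 2 and cannot be differentiated there.
    """
    if 0.0 <= x < 1.0:
        return 0.0
    if 1.0 <= x <= 2.0:
        return math.log(3.0)
    return -math.inf


def log_normal(x, mu, sigma):
    """Log density of the Gaussian variational distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def fit_forward_kl(n_iter=30, n_samples=5000, seed=0):
    """Fit N(mu, sigma^2) to the target by iterated moment matching.

    Each iteration draws samples from the current q, weights them by
    p(x) / q(x) up to a constant, and resets (mu, sigma) to the weighted
    mean and standard deviation. Only evaluations of log_target are
    needed, never its derivative.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 3.0  # deliberately broad starting point (assumption)
    for _ in range(n_iter):
        xs = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        log_w = [log_target(x) - log_normal(x, mu, sigma) for x in xs]
        max_lw = max(log_w)  # subtract the maximum for numerical stability
        ws = [math.exp(lw - max_lw) for lw in log_w]
        total = sum(ws)
        mean = sum(w * x for w, x in zip(ws, xs)) / total
        var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / total
        mu, sigma = mean, math.sqrt(var)
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = fit_forward_kl()
    print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```

For this toy target the exact mean is 1.25 and the exact standard deviation is sqrt(13/48), about 0.52, so the fitted values should land close to those up to Monte Carlo error. An ELBO-based fit of the same target would typically rely on reparameterized gradients of the log density, which are not informative for a piecewise-constant model like this one.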