Structural information about phylogenetic tree topologies plays an important role in phylogenetic inference. However, finding appropriate topological structures for specific phylogenetic inference tasks often requires significant design effort and domain expertise. In this paper, we propose a novel structural representation method for phylogenetic inference based on learnable topological features. By combining raw node features that minimize the Dirichlet energy with modern graph representation learning techniques, our learnable topological features provide efficient structural information about phylogenetic trees that adapts automatically to different downstream tasks without requiring domain expertise. We demonstrate the effectiveness and efficiency of our method on a tree probability estimation task with simulated data and on a benchmark of challenging variational Bayesian phylogenetic inference problems with real data.
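To make the notion of raw node features that minimize the Dirichlet energy concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each leaf is assigned a fixed one-hot feature vector (an assumption not stated in the abstract) and takes the Dirichlet energy to be the sum of squared feature differences over the edges of the tree. Under these assumptions the interior features are the harmonic extension of the leaf features and can be obtained from a linear system in the graph Laplacian. The function name `harmonic_node_features` and the toy four-leaf tree are hypothetical.

```python
import numpy as np


def harmonic_node_features(edges, n_nodes, leaf_ids):
    """Return node features minimizing sum over edges (u, v) of ||x_u - x_v||^2.

    Assumption (not from the abstract): leaf features are fixed to one-hot
    vectors, and only the interior node features are free variables.
    """
    leaf_ids = list(leaf_ids)
    leaf_set = set(leaf_ids)
    interior_ids = [v for v in range(n_nodes) if v not in leaf_set]

    # Graph Laplacian L = D - A of the tree.
    L = np.zeros((n_nodes, n_nodes))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0

    # One-hot boundary conditions on the leaves.
    X = np.zeros((n_nodes, len(leaf_ids)))
    X[leaf_ids, np.arange(len(leaf_ids))] = 1.0

    # Setting the gradient of the energy w.r.t. interior features to zero
    # gives L_II X_I = -L_IB X_B.
    L_II = L[np.ix_(interior_ids, interior_ids)]
    L_IB = L[np.ix_(interior_ids, leaf_ids)]
    X[interior_ids] = np.linalg.solve(L_II, -L_IB @ X[leaf_ids])
    return X


# Toy unrooted tree with leaves 0-3 and interior nodes 4 and 5:
# leaves 0 and 1 attach to node 4, leaves 2 and 3 attach to node 5.
edges = [(0, 4), (1, 4), (4, 5), (2, 5), (3, 5)]
X = harmonic_node_features(edges, n_nodes=6, leaf_ids=[0, 1, 2, 3])
print(X[4])  # feature of interior node 4
print(X[5])  # feature of interior node 5
```

In this sketch each interior feature equals the average of its neighbours' features, so it encodes how close the node is to each leaf in the topology. Features of this kind could then serve as inputs to a graph neural network, which is one way to read the combination with graph representation learning described above.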