Inference in non-linear continuous stochastic processes on trees is challenging, particularly when observations are sparse (leaf-only) and the topology is complex. Exact smoothing via Doob's $h$-transform is intractable for general non-linear dynamics, while particle-based methods degrade in high dimensions. We propose Neural Backward Filtering Forward Guiding (NBFFG), a unified framework for both discrete transitions and continuous diffusions. Our method constructs a variational posterior by leveraging an auxiliary linear-Gaussian process. This auxiliary process yields a closed-form backward filter that serves as a ``guide'', steering the generative path toward high-likelihood regions. We then learn a neural residual--parameterized as a normalizing flow or a controlled SDE--to capture the non-linear discrepancies. This formulation allows for an unbiased path-wise subsampling scheme, reducing the training complexity from tree-size dependent to path-length dependent. Empirical results show that NBFFG outperforms baselines on synthetic benchmarks, and we demonstrate the method on a high-dimensional inference task in phylogenetic analysis with reconstruction of ancestral butterfly wing shapes.
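To make the closed-form backward filter concrete, the following is a minimal sketch for the simplest auxiliary linear-Gaussian process: scalar Brownian motion on a tree with exactly observed leaves. It is an illustration under stated assumptions, not the NBFFG implementation. The `Node` class, the `backward_filter` function, and the variance rate `sigma2` are hypothetical names introduced here, all branch lengths are assumed strictly positive, and normalizing constants of the messages are dropped.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node for this sketch (not from the paper)."""
    branch_length: float = 0.0            # length of the edge to the parent
    value: Optional[float] = None         # observed value, leaves only
    children: List["Node"] = field(default_factory=list)


def backward_filter(node: Node, sigma2: float = 1.0) -> Tuple[float, float]:
    """Return (mean, variance) of the Gaussian message at `node`.

    The message summarizes, as a function of the state at `node`, the
    likelihood of all leaf observations below it under scalar Brownian
    motion with variance rate `sigma2` (an assumed toy auxiliary process).
    """
    if not node.children:
        # A leaf is observed exactly: a point mass at its value.
        return node.value, 0.0

    precision = 0.0
    weighted_mean = 0.0
    for child in node.children:
        mean_c, var_c = backward_filter(child, sigma2)
        # Pulling a message up an edge adds the transition variance.
        var_up = var_c + sigma2 * child.branch_length
        # Fusing children multiplies Gaussians: precisions add.
        precision += 1.0 / var_up
        weighted_mean += mean_c / var_up
    return weighted_mean / precision, 1.0 / precision


# Two leaves observed at 0.0 and 2.0, each one unit of time below the root.
root = Node(children=[Node(1.0, 0.0), Node(1.0, 2.0)])
root_mean, root_var = backward_filter(root)
```

In this toy case the recursion visits each node once, and the resulting `(mean, variance)` pairs are the kind of quantity that can act as a guide during forward simulation. The learned neural residual and the path-wise subsampling scheme described in the abstract are not represented in this sketch.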