Rather than refining individual candidate solutions to a general non-convex optimization problem, we consider, by analogy to evolution, minimizing the average loss of a parametric distribution over hypotheses. In this setting, we prove that Fisher-Rao natural gradient descent (FR-NGD) optimally approximates the continuous-time replicator equation, an essential model of evolutionary dynamics, by minimizing the mean-squared error of relative fitness. We term this finding "conjugate natural selection" and demonstrate its utility by numerically solving an example non-convex optimization problem over a continuous strategy space. Next, by developing known connections between discrete-time replicator dynamics and Bayes's rule, we show that continuous-time FR-NGD of the KL-divergence of modeled predictions from observations provides the optimal approximation of continuous Bayesian inference. We use this result to demonstrate a novel method for estimating the parameters of a stochastic process.
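The known connection between discrete-time replicator dynamics and Bayes's rule mentioned above can be illustrated with a minimal sketch. It assumes a finite set of hypotheses and treats the likelihood of each observation as that hypothesis's fitness; the coin example, the data, and the function names `replicator_step` and `bayes_update` are hypothetical and chosen only for illustration. It does not implement FR-NGD itself.

```python
import numpy as np

def replicator_step(p, fitness):
    """One discrete-time replicator update: reweight by fitness, then renormalize."""
    w = p * fitness
    return w / w.sum()

def bayes_update(prior, likelihood):
    """Posterior over hypotheses from Bayes's rule."""
    evidence = np.dot(prior, likelihood)  # marginal probability of the observation
    return prior * likelihood / evidence

# Hypothetical setup: three hypotheses about a coin's probability of heads.
heads_prob = np.array([0.2, 0.5, 0.8])
prior = np.array([1 / 3, 1 / 3, 1 / 3])
observations = [1, 1, 0, 1]  # made-up data: 1 = heads, 0 = tails

p_rep = prior.copy()
p_bayes = prior.copy()
for x in observations:
    # Likelihood of this observation under each hypothesis, used as fitness.
    lik = heads_prob if x == 1 else 1.0 - heads_prob
    p_rep = replicator_step(p_rep, lik)
    p_bayes = bayes_update(p_bayes, lik)

print(p_rep)
print(p_bayes)
```

With likelihood playing the role of fitness, the two updates are the same operation, so the two printed distributions should agree up to floating-point error.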