Rather than refining individual candidate solutions to a general non-convex optimization problem, we consider, by analogy to evolution, minimizing the average loss for a parametric distribution over hypotheses. In this setting, we prove that Fisher-Rao natural gradient descent (FR-NGD) optimally approximates the continuous-time replicator equation (an essential model of evolutionary dynamics), in the sense that it minimizes the mean-squared error in the relative fitness of competing hypotheses. We term this finding "conjugate natural selection" and demonstrate its utility by numerically solving an example non-convex optimization problem over a continuous strategy space. Next, by developing known connections between discrete-time replicator dynamics and Bayes's rule, we show that when absolute fitness corresponds to the negative KL-divergence of a hypothesis's predictions from actual observations, FR-NGD provides the optimal approximation of continuous Bayesian inference. We use this result to demonstrate a novel method for estimating the parameters of stochastic processes.
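The least-squares claim about FR-NGD can be checked numerically on a toy problem. The sketch below is not the authors' implementation; it assumes a two-parameter exponential family on a finite grid, p_theta(x) proportional to exp(theta_1 x + theta_2 x^2), and an arbitrary non-convex loss chosen for illustration. All function names (`softmax_dist`, `scores`, `natural_gradient`, `fitness_mse`) are hypothetical. Under the replicator equation, the log-probability of a hypothesis changes at a rate equal to its relative fitness, -(L(x) - E[L]); a parameter velocity v induces the rate grad log p_theta(x) · v instead. The code compares the expected squared gap between the two for the natural gradient direction and for a few alternatives.

```python
import math

def softmax_dist(theta, feats):
    """p_theta(x_i) proportional to exp(theta . phi(x_i)) on a finite grid."""
    logits = [sum(t * f for t, f in zip(theta, phi)) for phi in feats]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def scores(p, feats):
    """Score vectors grad_theta log p_theta(x_i) = phi(x_i) - E_p[phi]."""
    k = len(feats[0])
    mean = [sum(pi * phi[j] for pi, phi in zip(p, feats)) for j in range(k)]
    return [[phi[j] - mean[j] for j in range(k)] for phi in feats]

def natural_gradient(p, s, loss):
    """Return (v, g) with g = grad_theta E_p[loss] and v = -F^{-1} g (2 parameters)."""
    g = [sum(pi * li * si[j] for pi, li, si in zip(p, loss, s)) for j in range(2)]
    # Fisher information matrix F = E_p[score score^T]
    F = [[sum(pi * si[a] * si[b] for pi, si in zip(p, s)) for b in range(2)]
         for a in range(2)]
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    v = [-(F[1][1] * g[0] - F[0][1] * g[1]) / det,
         -(-F[1][0] * g[0] + F[0][0] * g[1]) / det]
    return v, g

def fitness_mse(v, p, s, loss):
    """E_p[(score . v - relative fitness)^2], relative fitness = -(loss - E_p[loss])."""
    mean_loss = sum(pi * li for pi, li in zip(p, loss))
    return sum(pi * (si[0] * v[0] + si[1] * v[1] + (li - mean_loss)) ** 2
               for pi, si, li in zip(p, s, loss))

# Illustrative setup (assumed, not from the paper): grid on [-3, 3], quartic loss.
xs = [-3.0 + 0.1 * i for i in range(61)]
feats = [(x, x * x) for x in xs]
loss = [x ** 4 - 3.0 * x ** 2 + x for x in xs]
theta = [0.5, -0.5]

p = softmax_dist(theta, feats)
s = scores(p, feats)
v_ng, grad = natural_gradient(p, s, loss)
v_plain = [-grad[0], -grad[1]]  # ordinary (Euclidean) gradient descent direction

print("MSE, natural gradient:", fitness_mse(v_ng, p, s, loss))
print("MSE, plain gradient:  ", fitness_mse(v_plain, p, s, loss))
```

Because the squared gap is a convex quadratic in v whose normal equations are F v = -grad E[L], the natural gradient direction should give the smallest error of any parameter velocity. The residual error that remains reflects how far the two-parameter family is from being able to follow the replicator dynamics exactly.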