A Risk Management Perspective on Statistical Estimation and Generalized Variational Inference

Generalized variational inference (GVI) provides an optimization-theoretic framework for statistical estimation that encapsulates many traditional estimation procedures. The typical GVI problem is to compute a distribution of parameters that maximizes the expected payoff minus the divergence of the distribution from a specified prior. In this way, GVI enables likelihood-free estimation with the ability to control the influence of the prior by tuning the so-called learning rate. Recently, GVI was shown to outperform traditional Bayesian inference when the model and prior distribution are misspecified. In this paper, we introduce and analyze a new GVI formulation based on utility theory and risk management. Our formulation maximizes the expected payoff subject to constraints on the maximizing distribution. We recover the original GVI distribution by choosing the feasible set to include a constraint on the divergence of the distribution from the prior. In doing so, we automatically determine the learning rate as the Lagrange multiplier for the constraint. In this setting, we are able to transform the infinite-dimensional estimation problem into a two-dimensional convex program. This reformulation further provides an analytic expression for the optimal density of parameters. In addition, we prove asymptotic consistency results for empirical approximations of our optimal distributions. Throughout, we draw connections between our estimation procedure and risk management. In fact, we demonstrate that our estimation procedure is equivalent to evaluating a risk measure. We test our procedure on an estimation problem with a misspecified model and prior distribution, and conclude with some extensions of our approach.
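
To make the constrained formulation concrete, the following is a minimal sketch under assumptions that are not taken from the paper: the divergence is taken to be the Kullback-Leibler (KL) divergence, the parameter space is discretized onto a finite grid, and the payoff is a negative mean squared loss. In that special case, the distribution maximizing the expected payoff subject to KL(q || prior) <= eps is the standard exponential tilting q ∝ prior · exp(payoff / λ), where λ > 0 plays the role of the multiplier (learning rate) for the constraint and can be found by a one-dimensional search. The function names, the data, and the grid are hypothetical and purely illustrative.

```python
import numpy as np

def tilted_distribution(prior, payoff, lam):
    """Exponentially tilted distribution q ∝ prior * exp(payoff / lam) on a grid."""
    logw = np.log(prior) + payoff / lam
    logw -= logw.max()              # shift for numerical stability
    q = np.exp(logw)
    return q / q.sum()

def kl(q, prior):
    """KL(q || prior) for discrete distributions (terms with q = 0 contribute 0)."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / prior[mask])))

def solve_constrained(prior, payoff, eps, lo=1e-6, hi=1e6, iters=200):
    """Find lam such that KL(q_lam || prior) is approximately eps.

    Assumes eps lies strictly between 0 and the largest attainable KL value.
    KL(q_lam || prior) decreases as lam grows (q_lam approaches the prior),
    so a bisection on a logarithmic scale is enough.
    """
    for _ in range(iters):
        mid = np.sqrt(lo * hi)      # geometric midpoint
        if kl(tilted_distribution(prior, payoff, mid), prior) > eps:
            lo = mid                # too far from the prior: increase lam
        else:
            hi = mid                # constraint satisfied: try a smaller lam
    return tilted_distribution(prior, payoff, hi), hi

# Hypothetical toy problem: estimate a location parameter theta.
theta = np.linspace(-3.0, 3.0, 121)           # parameter grid (assumption)
data = np.array([0.9, 1.1, 1.3])              # made-up observations
prior = np.exp(-0.5 * theta**2)               # prior centered at 0, away from the data
prior /= prior.sum()
payoff = -np.mean((data[:, None] - theta[None, :])**2, axis=0)

eps = 0.5                                     # divergence budget (assumption)
q, lam = solve_constrained(prior, payoff, eps)
print("multiplier:", lam)
print("KL(q || prior):", kl(q, prior))
print("expected payoff under q:", float(q @ payoff))
```

Shrinking `eps` keeps the estimate closer to the prior and yields a larger multiplier, while enlarging it lets the payoff dominate. This mirrors, in the KL special case only, how the multiplier is determined by the constraint rather than tuned by hand.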
