When approximating an intractable density via variational inference (VI), the variational family is typically chosen as a simple parametric family that very likely does not contain the target. This raises the question: under which conditions can we recover characteristics of the target despite misspecification? In this work, we extend previous theoretical results on robust VI with location-scale families under target symmetries in two substantial ways. (1) We extend them to a wider range of divergences by providing sufficient conditions for exact recovery of the target mean and correlation matrix when using the forward Kullback-Leibler divergence and $\alpha$-divergences. (2) In doing so, we find that we can drop the restrictive assumption of a log-concave target made in previous work, which allows us to give guarantees for a wider range of targets, including multi-modal ones. In our experiments, we show how our guarantees can serve as guidelines for the choice of the variational family and the value of $\alpha$. We also illustrate, on a diverse set of examples, how and why optimization can fail when our sufficient conditions do not hold.
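As a rough illustration of the mean-recovery claim for the forward Kullback-Leibler divergence, the sketch below fits a single Gaussian to a symmetric bimodal target, which is not log-concave. The target (an equal-weight mixture of two unit-variance Gaussians centred at 1.0), the Gaussian variational family, the quadrature grid, and the brute-force grid search are all assumptions chosen for this example and are not taken from the experiments described above. For a Gaussian family, minimizing the forward KL amounts to matching the first two moments, so the fitted location should coincide with the centre of symmetry even though the family cannot represent the two modes.

```python
import numpy as np

def log_normal_pdf(x, mu, sigma):
    """Log density of a univariate Gaussian N(mu, sigma^2)."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

# Hypothetical target: symmetric two-component mixture centred at 1.0.
# It is bimodal, hence not log-concave.
center = 1.0
x = np.linspace(-15.0, 17.0, 4001)  # quadrature grid covering the target mass
dx = x[1] - x[0]
p = 0.5 * np.exp(log_normal_pdf(x, center - 3.0, 1.0)) \
    + 0.5 * np.exp(log_normal_pdf(x, center + 3.0, 1.0))

def forward_kl_objective(mu, sigma):
    """KL(p || q) up to the entropy of p, which does not depend on (mu, sigma)."""
    return -np.sum(p * log_normal_pdf(x, mu, sigma)) * dx

# Brute-force search over the location and scale of the Gaussian family.
mus = np.linspace(-4.0, 6.0, 201)
sigmas = np.linspace(0.5, 6.0, 111)
values = np.array([[forward_kl_objective(m, s) for s in sigmas] for m in mus])
i, j = np.unravel_index(np.argmin(values), values.shape)
mu_hat, sigma_hat = mus[i], sigmas[j]

print("fitted location:", mu_hat)
print("fitted scale:", sigma_hat)
```

The fitted location can be compared with `center`, and the fitted scale with the target standard deviation `np.sqrt(10.0)` (within-component variance 1 plus between-component variance 9), up to the resolution of the search grid.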