Variational families with full-rank covariance approximations are known to perform poorly in black-box variational inference (BBVI), both empirically and theoretically. In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to, e.g., mean-field families. This is particularly critical for hierarchical Bayesian models with local variables, whose dimensionality increases with the size of the dataset. Consequently, one gets an iteration complexity with an explicit \(\mathcal{O}(N^2)\) dependence on the dataset size \(N\). In this paper, we explore a theoretical middle ground between mean-field variational families and full-rank families: structured variational families. We rigorously prove that certain scale matrix structures can achieve a better iteration complexity of \(\mathcal{O}(N)\), implying better scaling with respect to \(N\). We empirically verify our theoretical results on large-scale hierarchical models.