Gaussian mixture distributions are commonly employed to represent general probability distributions. Although Gaussian mixtures are important for uncertainty estimation, the entropy of a Gaussian mixture cannot be calculated analytically. Notably, Gal and Ghahramani [2016] proposed an approximate entropy given by the sum of the entropies of unimodal Gaussian distributions. This approximation is easy to calculate analytically regardless of dimension, but it lacks theoretical guarantees. In this paper, we theoretically analyze the approximation error between the true entropy and the approximate one to reveal when this approximation works effectively. This error is controlled by how far apart the Gaussian components of the mixture are. To measure this separation, we introduce the ratios of the distances between the means to the sum of the variances of the Gaussian components, and we show that the error converges to zero as these ratios tend to infinity. This situation is more likely to occur in higher-dimensional spaces. Therefore, our results provide a guarantee that this approximation works well in higher-dimensional problems, particularly in settings such as neural networks that involve a large number of weights.
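The dependence of the error on component separation can be illustrated numerically. The sketch below is not the paper's analysis. It assumes the approximate entropy takes the form of the weight-averaged component entropies plus the entropy of the mixing weights (the abstract does not spell out the exact form), and it compares that value with a Monte Carlo estimate of the true entropy for a two-component mixture with identity covariances. The function names `approx_entropy`, `mc_entropy`, and `entropy_gap` are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_entropy(cov):
    """Closed-form differential entropy of a Gaussian with covariance `cov`."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def log_gaussian_pdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of `x`."""
    d = mean.shape[0]
    diff = x - mean
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def approx_entropy(weights, covs):
    """ASSUMED form of the approximation: weighted component entropies
    plus the entropy of the mixing weights."""
    comp = sum(w * gaussian_entropy(c) for w, c in zip(weights, covs))
    return comp - np.sum(weights * np.log(weights))


def mc_entropy(weights, means, covs, n_samples, rng):
    """Monte Carlo estimate of the mixture entropy, -E[log q(x)]."""
    counts = rng.multinomial(n_samples, weights)
    samples = np.vstack([
        rng.multivariate_normal(m, c, size=k)
        for m, c, k in zip(means, covs, counts)
    ])
    log_terms = np.stack([
        np.log(w) + log_gaussian_pdf(samples, m, c)
        for w, m, c in zip(weights, means, covs)
    ])
    log_q = np.logaddexp.reduce(log_terms, axis=0)
    return -np.mean(log_q)


def entropy_gap(separation, dim=2, n_samples=20000, seed=0):
    """Approximate minus estimated true entropy for two equally weighted
    unit-covariance components whose means are `separation` apart."""
    rng = np.random.default_rng(seed)
    weights = np.array([0.5, 0.5])
    means = [np.zeros(dim), np.zeros(dim)]
    means[1][0] = separation
    covs = [np.eye(dim), np.eye(dim)]
    approx = approx_entropy(weights, covs)
    return approx - mc_entropy(weights, means, covs, n_samples, rng)


if __name__ == "__main__":
    for s in (0.0, 1.0, 2.0, 4.0, 8.0, 20.0):
        print(f"separation={s:5.1f}  gap={entropy_gap(s):.4f}")
```

Under these assumptions, the gap should be close to log 2 when the two components coincide and should shrink toward zero, up to Monte Carlo noise, as the means move apart relative to the component variances.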