Improving Mutual Information Estimation with Annealed and Energy-Based Bounds

Mutual information (MI) is a fundamental quantity in information theory and machine learning. However, direct estimation of MI is intractable, even if the true joint probability density for the variables of interest is known, as it involves estimating a potentially high-dimensional log partition function. In this work, we present a unifying view of existing MI bounds from the perspective of importance sampling, and propose three novel bounds based on this approach. Since accurate estimation of MI without density information requires a sample size exponential in the true MI, we assume either a single marginal or the full joint density information is known. In settings where the full joint density is available, we propose Multi-Sample Annealed Importance Sampling (AIS) bounds on MI, which we demonstrate can tightly estimate large values of MI in our experiments. In settings where only a single marginal distribution is known, we propose Generalized IWAE (GIWAE) and MINE-AIS bounds. Our GIWAE bound unifies variational and contrastive bounds in a single framework that generalizes InfoNCE, IWAE, and Barber-Agakov bounds. Our MINE-AIS method improves upon existing energy-based methods such as MINE-DV and MINE-F by directly optimizing a tighter lower bound on MI. MINE-AIS uses MCMC sampling to estimate gradients for training and Multi-Sample AIS for evaluating the bound. Our methods are particularly suitable for evaluating MI in deep generative models, since explicit forms of the marginal or joint densities are often available. We evaluate our bounds on estimating the MI of VAEs and GANs trained on the MNIST and CIFAR datasets, and showcase significant gains over existing bounds in these challenging settings with high ground truth MI.
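
To make the importance sampling view concrete, the following is a minimal sketch on a toy model, not the Multi-Sample AIS, GIWAE, or MINE-AIS bounds proposed here. It assumes a hypothetical linear Gaussian model, z ~ N(0, 1) and x | z ~ N(z, sigma^2), chosen only because its MI is known in closed form as 0.5 * log(1 + 1/sigma^2). With the joint density known, the intractable term is log p(x); here it is approximated by simple importance sampling with the prior p(z) as the proposal. By Jensen's inequality this underestimates log p(x) in expectation, so the resulting MI estimate is an upper bound in expectation. All function names and parameters are illustrative assumptions.

```python
import numpy as np


def log_mean_exp(a, axis):
    """Numerically stable log(mean(exp(a))) along the given axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(a - m), axis=axis))


def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2


def true_mi(sigma):
    """Closed-form MI of the toy model z ~ N(0, 1), x | z ~ N(z, sigma^2)."""
    return 0.5 * np.log(1.0 + 1.0 / sigma**2)


def is_mi_upper_bound(sigma=0.5, n_pairs=2000, k=1000, seed=0):
    """Monte Carlo estimate of an importance sampling upper bound on MI.

    Assumption: the prior p(z) serves as the proposal, which is the simplest
    choice and is not the annealed scheme described in the abstract.
    """
    rng = np.random.default_rng(seed)

    # Draw (x, z) pairs from the joint p(z) p(x | z).
    z = rng.standard_normal(n_pairs)
    x = z + sigma * rng.standard_normal(n_pairs)

    # First term: log p(x | z) at the paired samples.
    log_lik = gaussian_log_pdf(x, z, sigma)

    # Second term: log p(x) approximated by log (1/K) sum_k p(x | z_k),
    # with z_k ~ p(z). This is a lower bound on log p(x) in expectation.
    z_prop = rng.standard_normal((n_pairs, k))
    log_w = gaussian_log_pdf(x[:, None], z_prop, sigma)
    log_px_lower = log_mean_exp(log_w, axis=1)

    return float(np.mean(log_lik - log_px_lower))


if __name__ == "__main__":
    for k in (1, 10, 1000):
        print(f"K={k:5d}  estimate={is_mi_upper_bound(k=k):.3f}")
    print(f"closed-form MI = {true_mi(0.5):.3f}")
```

In this toy setting the bound tightens toward the closed-form value as K grows. In high dimensions, or when the true MI is large, a prior proposal needs impractically many samples, which is the regime the annealed bounds above are intended to address.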
