Diffusion bridge models have recently become a powerful tool in generative modeling. In this work, we leverage them to address another important problem in machine learning and information theory: estimating the mutual information (MI) between two random variables. By framing MI estimation as a domain transfer problem, we construct an unbiased estimator suited to data that pose difficulties for conventional MI estimators. We demonstrate the performance of our estimator on three standard MI estimation benchmarks (low-dimensional, image-based, and high-MI) and on real-world data (protein language model embeddings).
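For reference, the quantity being estimated can be written out explicitly. The expression below is the standard textbook definition of MI for random variables $X$ and $Y$ with joint density $p_{X,Y}$ and marginals $p_X$, $p_Y$. These symbols are notation assumed here for illustration; the abstract itself fixes no notation and does not state which distributions the domain transfer connects.

\[
I(X;Y) \;=\; D_{\mathrm{KL}}\!\left(p_{X,Y} \,\middle\|\, p_X \otimes p_Y\right)
\;=\; \mathbb{E}_{(x,y)\sim p_{X,Y}}\!\left[\log \frac{p_{X,Y}(x,y)}{p_X(x)\,p_Y(y)}\right]
\]

MI is non-negative and equals zero exactly when $X$ and $Y$ are independent, which is why it compares the joint distribution with the product of the marginals.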