Estimating mutual information accurately is pivotal across diverse applications, from machine learning to communications and biology, enabling us to gain insight into the inner mechanisms of complex systems. Yet, dealing with high-dimensional data presents a formidable challenge, due to its size and the presence of intricate relationships. Recently proposed neural methods employing variational lower bounds on the mutual information have gained prominence. However, these approaches suffer from either high bias or high variance, as both the sample size and the structure of the loss function directly influence the training process. In this paper, we propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence. We investigate the impact of the permutation function used to obtain the marginal training samples and present a novel architectural solution based on derangements. The proposed estimator is flexible, offering an excellent bias/variance trade-off. Extensive experiments in established reference scenarios show that, compared with state-of-the-art neural estimators, our approach achieves higher accuracy at lower complexity.
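The key idea behind derangement-based sampling can be illustrated with a minimal sketch. Discriminative estimators need samples from both the joint distribution $p(x, y)$ and the product of marginals $p(x)p(y)$; the latter are typically obtained by permuting the $y$ batch. A random permutation may leave some pairs $(x_i, y_i)$ intact, contaminating the "marginal" batch with joint samples, whereas a derangement (a permutation with no fixed points) guarantees every pair is broken. The snippet below is an illustrative sketch, not the paper's exact architecture; `random_derangement` is a hypothetical helper using rejection sampling, and the toy data are assumptions.

```python
import numpy as np

def random_derangement(n, rng):
    """Sample a permutation of {0, ..., n-1} with no fixed points
    by rejection sampling (expected ~e tries, independent of n)."""
    while True:
        p = rng.permutation(n)
        if not np.any(p == np.arange(n)):
            return p

rng = np.random.default_rng(0)

# Toy joint batch: y is correlated with x, as in a paired dataset.
n = 8
x = rng.normal(size=(n, 2))
y = x + 0.1 * rng.normal(size=(n, 2))

# Derangement of the y batch: every y_i is re-paired with some x_j, j != i,
# so (x, y_shuffled) approximates samples from p(x)p(y).
pi = random_derangement(n, rng)
y_shuffled = y[pi]

# No row of y_shuffled coincides with its original partner.
assert not np.any(np.all(y_shuffled == y, axis=1))
```

A plain random shuffle would leave each pair intact with probability $1/n$ per position, which biases the marginal batch for small batch sizes; the derangement removes this failure mode entirely.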