Accurate estimation of mutual information is a crucial task in various applications, including machine learning, communications, and biology, since it enables the understanding of complex systems. High-dimensional data make the task extremely challenging because of the amount of data to be processed and the presence of convoluted patterns. Neural estimators based on variational lower bounds of the mutual information have gained attention in recent years, but they are prone to either high bias or high variance as a consequence of the partition function. We propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence. We investigate the impact of the permutation function used to obtain the marginal training samples and present a novel architectural solution based on derangements. The proposed estimator is flexible and exhibits an excellent bias/variance trade-off. Experiments on reference scenarios demonstrate that our approach outperforms state-of-the-art neural estimators in terms of both accuracy and complexity.
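To illustrate the role of derangements in obtaining marginal training samples, the following is a minimal sketch and not necessarily the authors' implementation. It assumes a batch of paired samples $(x_i, y_i)$ drawn from the joint distribution and re-pairs each $x_i$ with $y_{p(i)}$, where $p$ is a permutation with no fixed points. A plain random shuffle can leave some index in place, so a few "marginal" pairs would in fact be joint pairs; a derangement rules this out. The names `random_derangement` and `marginal_pairs` are hypothetical, and rejection sampling is only one simple way to draw a derangement.

```python
import random


def random_derangement(n, rng=None):
    """Return a permutation p of range(n) with p[i] != i for every i.

    Uses rejection sampling: shuffle until no element stays in place.
    For large n roughly a fraction 1/e of all permutations are
    derangements, so only a few attempts are expected.
    """
    if n < 2:
        raise ValueError("a derangement needs at least two elements")
    if rng is None:
        rng = random.Random()
    perm = list(range(n))
    while True:
        rng.shuffle(perm)
        if all(perm[i] != i for i in range(n)):
            return list(perm)


def marginal_pairs(xs, ys, rng=None):
    """Pair x_i with y_{p(i)} for a derangement p (hypothetical helper).

    Because p has no fixed points, no original joint pair (x_i, y_i)
    is reused among the returned pairs.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    perm = random_derangement(len(xs), rng)
    return [(xs[i], ys[perm[i]]) for i in range(len(xs))]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility
    xs = ["x0", "x1", "x2", "x3"]
    ys = ["y0", "y1", "y2", "y3"]
    for pair in marginal_pairs(xs, ys, rng):
        print(pair)
```

In a neural estimator, the lists would typically be replaced by batch tensors and the derangement applied as an index along the batch dimension; that detail is an assumption here, since the abstract does not specify it.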