Estimating mutual correlations between random variables or data streams is essential for intelligent behavior and decision-making. As a fundamental quantity for measuring statistical relationships, mutual information has been extensively studied and utilized for its generality and equitability. However, existing methods often lack either the efficiency needed for real-time applications (for example, estimators that require test-time optimization of a neural network) or the differentiability required for end-to-end learning (for example, histogram-based estimators). We introduce a neural network called InfoNet, which directly outputs mutual information estimates for data streams by leveraging the attention mechanism and the computational efficiency of deep learning infrastructures. By maximizing a dual formulation of mutual information through large-scale simulated training, our approach circumvents time-consuming test-time optimization while retaining the ability to generalize. We evaluate the effectiveness and generalization of the proposed mutual information estimation scheme on various families of distributions and applications. Our results demonstrate that InfoNet and its training process provide a graceful efficiency-accuracy trade-off and order-preserving properties. We will make the code and models available as a comprehensive toolbox to facilitate studies in different fields requiring real-time mutual information estimation.
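For context on the histogram baseline mentioned in the abstract, the following is a minimal sketch of a plug-in histogram estimator of mutual information, checked against the closed-form value for a bivariate Gaussian. It is not InfoNet and is not taken from the authors' code; the function name `histogram_mi`, the bin count, and the sample size are illustrative assumptions. The binning step is a hard assignment, which is why an estimator of this kind provides no useful gradient with respect to its inputs.

```python
import numpy as np


def histogram_mi(x, y, bins=30):
    """Plug-in mutual information estimate (in nats) from a 2D histogram.

    Hypothetical helper for illustration only; not part of InfoNet.
    """
    # Hard binning: each sample is counted in exactly one cell.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()                 # empirical joint distribution
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    prod = p_x @ p_y                           # product of marginals, (bins, bins)
    mask = p_xy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / prod[mask])))


rng = np.random.default_rng(0)
n, rho = 50_000, 0.8

# Correlated Gaussian pair with correlation rho, plus an independent variable.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
z = rng.standard_normal(n)

# Closed-form mutual information of a bivariate Gaussian with correlation rho.
true_mi = -0.5 * np.log(1.0 - rho**2)

mi_dep = histogram_mi(x, y)   # should be close to true_mi
mi_ind = histogram_mi(x, z)   # should be close to zero

print(f"analytic: {true_mi:.3f}  dependent: {mi_dep:.3f}  independent: {mi_ind:.3f}")
```

The estimate depends on the bin count and sample size: finer bins reduce discretization error but increase finite-sample bias, so the values printed here are only approximate.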