Transfer entropy (TE) is an information-theoretic measure that captures the directional flow of information between processes, providing valuable insights for a wide range of real-world applications. This work proposes Transfer Entropy Estimation via Transformers (TREET), a novel transformer-based approach for estimating the TE of stationary processes. The proposed approach applies the Donsker-Varadhan (DV) representation to TE and leverages the attention mechanism for the task of neural estimation. We present a detailed theoretical and empirical study of TREET and compare it to existing methods. To increase its applicability, we design an estimated-TE optimization scheme motivated by the functional representation lemma. We then exploit this joint optimization scheme to optimize the capacity of communication channels with memory, a canonical optimization problem in information theory, and demonstrate the memory capabilities of our estimator. Finally, we apply TREET to real-world feature analysis. Our work, combined with state-of-the-art deep learning methods, opens a new door to communication problems that have yet to be solved.
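For context, the estimator builds on the Donsker-Varadhan variational representation of the Kullback-Leibler divergence. The sketch below illustrates the general form of such a representation applied to TE, using a generic memory length $\ell$; the exact conditioning structure and parameterization used by TREET are developed in the body of the paper:
\[
D_{\mathrm{KL}}(P \,\|\, Q) \;=\; \sup_{f} \; \mathbb{E}_{P}\!\left[f\right] \;-\; \log \mathbb{E}_{Q}\!\left[e^{f}\right],
\]
and, since $\mathrm{TE}_{X \to Y} = I\big(X_{n-\ell}^{n-1};\, Y_n \,\big|\, Y_{n-\ell}^{n-1}\big)$ is a conditional mutual information, i.e., a KL divergence between the joint law and a product reference,
\[
\mathrm{TE}_{X \to Y} \;=\; \sup_{f} \; \mathbb{E}_{P_{X_{n-\ell}^{n-1},\, Y_{n-\ell}^{n}}}\!\left[f\right] \;-\; \log \mathbb{E}_{P_{Y_n \mid Y_{n-\ell}^{n-1}}\, P_{X_{n-\ell}^{n-1},\, Y_{n-\ell}^{n-1}}}\!\left[e^{f}\right],
\]
where the supremum is over measurable functions of the past of $X$ and the present and past of $Y$, and the maximizing function is approximated by a neural network, in TREET's case a transformer with attention over the process history.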