In a feedforward network, Transfer Entropy (TE) can measure the influence that one layer has on another by quantifying the information transferred between them during training. According to the Information Bottleneck principle, a neural model's internal representation should compress the input data as much as possible while still retaining sufficient information about the output. Information Plane analysis is a visualization technique for studying this trade-off between compression and information preservation: it plots the information a representation retains about the input against the information it retains about the output. The claim that information-theoretic compression, as measured by mutual information, is causally linked to generalization is plausible, but results from different studies conflict. Unlike mutual information, TE can capture temporal relationships between variables. To explore such links, our novel approach uses TE to quantify information transfer between neural layers and performs Information Plane analysis. We obtained encouraging experimental results, which open the possibility of further investigation.
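To make the notion of TE concrete, the following is a minimal sketch of a plug-in (frequency-count) estimate of transfer entropy between two discrete time series, using a history length of one step. It is an illustration under stated assumptions, not the estimator used in this work: the function name `transfer_entropy`, the history length, and the toy data are all hypothetical, and applying it to neural layers would additionally require discretizing the activations in some way that is not specified here.

```python
from collections import Counter
from math import log2
import random


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum over (y_next, y_prev, x_prev) of
         p(y_next, y_prev, x_prev)
         * log2( p(y_next | y_prev, x_prev) / p(y_next | y_prev) )
    Assumes both sequences are discrete-valued and of equal length.
    """
    # Joint counts of (target[t+1], target[t], source[t]).
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    n = sum(triples.values())

    # Marginal counts derived from the joint counts.
    pairs_yy = Counter()   # (target[t+1], target[t])
    pairs_yx = Counter()   # (target[t], source[t])
    singles_y = Counter()  # target[t]
    for (y_next, y_prev, x_prev), c in triples.items():
        pairs_yy[(y_next, y_prev)] += c
        pairs_yx[(y_prev, x_prev)] += c
        singles_y[y_prev] += c

    te = 0.0
    for (y_next, y_prev, x_prev), c in triples.items():
        cond_full = c / pairs_yx[(y_prev, x_prev)]                  # p(y_next | y_prev, x_prev)
        cond_hist = pairs_yy[(y_next, y_prev)] / singles_y[y_prev]  # p(y_next | y_prev)
        te += (c / n) * log2(cond_full / cond_hist)
    return te


# Toy data (hypothetical): y copies x with a one-step delay,
# so information flows from x to y but not from y to x.
random.seed(0)
x = [random.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)  # expected to be close to 1 bit
te_yx = transfer_entropy(y, x)  # expected to be close to 0 bits
print(f"TE(x -> y) = {te_xy:.3f} bits, TE(y -> x) = {te_yx:.3f} bits")
```

The asymmetry between the two directions is what distinguishes TE from mutual information, which is symmetric in its arguments and would not reveal which series drives the other.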