With the success of self-supervised representations, researchers seek a better understanding of the information encapsulated within a representation. Among the various interpretability methods, we focus on classification-based linear probing. We aim to build a solid understanding of linear probing and to provide guidelines for it by constructing a novel mathematical framework grounded in information theory. First, we connect probing with the variational bounds of mutual information (MI), which relaxes the constraints on probe design and equates linear probing with fine-tuning. Then, we investigate the empirical behaviors and practices of probing through this framework. We analyze why the layer-wise performance curve is convex, which seemingly violates the data processing inequality. We show, however, that intermediate representations can have the largest MI estimate because of the tradeoff between improving separability and decreasing MI. We further suggest that the margin of linearly separable representations can serve as a criterion for measuring the "goodness of representation." We also compare accuracy and MI as measurement criteria. Finally, we empirically validate our claims by observing how well self-supervised speech models retain word and phoneme information.
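The link between probing and the variational bounds of MI can be illustrated with the standard bound I(X; Y) >= H(Y) - CE, where CE is the cross-entropy of a probe's predicted label distribution. The sketch below is a minimal illustration under that assumption and is not necessarily the exact estimator used in this work; the function name `mi_lower_bound` and its arguments are hypothetical, and values are in nats.

```python
import numpy as np

def mi_lower_bound(probs, labels):
    """Estimate the variational lower bound H(Y) - CE on I(X; Y), in nats.

    probs  : (n, k) array of probe output probabilities q(y | x) (assumed input).
    labels : (n,) array of integer class labels in [0, k).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n, k = probs.shape

    # Empirical label entropy H(Y), skipping classes that never occur.
    p_y = np.bincount(labels, minlength=k) / n
    nonzero = p_y > 0
    h_y = -np.sum(p_y[nonzero] * np.log(p_y[nonzero]))

    # Empirical cross-entropy of the probe on the true labels.
    # Clipping avoids log(0) for over-confident wrong predictions.
    true_class_probs = np.clip(probs[np.arange(n), labels], 1e-12, 1.0)
    cross_entropy = -np.mean(np.log(true_class_probs))

    return h_y - cross_entropy
```

Under this bound, a lower cross-entropy from the probe gives a tighter MI estimate, which is one way to read the claim that a more expressive probe need not be avoided. On a finite sample the value is only an estimate of the bound, so it is best computed on held-out data.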