Deep neural networks have demonstrated remarkable performance in supervised learning tasks but require large amounts of labeled data. Self-supervised learning offers an alternative paradigm, enabling models to learn from data without explicit labels. Information theory has been instrumental in understanding and optimizing deep neural networks. Specifically, the information bottleneck principle has been applied to optimize the trade-off between compression and the preservation of relevant information in supervised settings. However, the optimal information objective in self-supervised learning remains unclear. In this paper, we review various approaches to self-supervised learning from an information-theoretic standpoint and present a unified framework that formalizes the self-supervised information-theoretic learning problem. We integrate existing research into a coherent framework, examine recent self-supervised methods, and identify research opportunities and challenges. Moreover, we discuss the empirical measurement of information-theoretic quantities and their estimators. This paper offers a comprehensive review of the intersection of information theory, self-supervised learning, and deep neural networks.