Deep neural networks (DNNs) learn latent representations induced by their downstream task, objective function, and other parameters. The quality of the learned representations affects both the DNN's generalization ability and the coherence of the emerging latent space. The Information Bottleneck (IB) provides a hypothetically optimal framework for data modeling, yet it is often intractable. Recent efforts have combined DNNs with the IB by applying VAE-inspired variational methods to approximate bounds on mutual information, which improved robustness to adversarial attacks. This work introduces a new and tighter variational bound for the IB, improving the performance of previous IB-inspired DNNs. These advancements strengthen the case for the IB and its variational approximations as a data modeling framework, and provide a simple method to significantly enhance the adversarial robustness of classifier DNNs.
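For orientation, the IB seeks a representation Z of the input X that is compressed yet predictive of the label Y, commonly written as maximizing I(Z;Y) - β I(Z;X). The sketch below shows the per-example loss typically used by earlier variational IB classifiers: a cross-entropy term standing in for the bound on I(Z;Y), plus a β-weighted KL term standing in for the bound on I(Z;X). It is not the tighter bound introduced in this work. The diagonal-Gaussian encoder, the standard-normal prior, the function names, and the default `beta` are all illustrative assumptions.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats.

    Assumes a diagonal-Gaussian encoder and a standard-normal prior.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits), in nats."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_norm - logits[label]


def vib_loss(logits, label, mu, log_var, beta=1e-3):
    """Hypothetical per-example variational IB loss.

    logits  : classifier outputs computed from a sample z ~ N(mu, exp(log_var))
    label   : integer class index
    mu      : encoder mean for this example
    log_var : encoder log-variance for this example
    beta    : trade-off between prediction and compression (assumed value)
    """
    prediction_term = cross_entropy(logits, label)
    compression_term = gaussian_kl_to_standard_normal(mu, log_var)
    return prediction_term + beta * compression_term
```

A larger `beta` penalizes information retained about the input more heavily; with `beta = 0` the loss reduces to ordinary cross-entropy.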