The Information Bottleneck (IB) principle has emerged as a promising approach for improving the generalization, robustness, and interpretability of deep neural networks, with demonstrated efficacy in image segmentation, document clustering, and semantic communication. Among IB implementations, the IB Lagrangian method, which employs Lagrangian multipliers, is the most widely adopted. While numerous methods for optimizing the IB Lagrangian based on variational bounds and neural estimators are available, their performance depends heavily on the quality of the chosen bound or estimator, whose design is inherently error-prone. To address this limitation, we introduce Structured IB, a framework for investigating potentially structured features. By incorporating auxiliary encoders that extract otherwise missing informative features, we generate more informative representations. Our experiments demonstrate superior prediction accuracy and better preservation of task-relevant information compared to the original IB Lagrangian method, even with a smaller network.
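For context, the IB Lagrangian mentioned above is conventionally written in the following standard form (notation assumed here: $X$ the input, $Y$ the target, $Z$ the learned representation, and $I(\cdot;\cdot)$ mutual information; some works instead place the multiplier on the compression term):

```latex
% Standard IB Lagrangian, minimized over the encoder p(z|x):
\mathcal{L}_{\mathrm{IB}} \;=\; I(X;Z) \;-\; \beta\, I(Z;Y), \qquad \beta > 0,
```

where $\beta$ trades off compression of the input, $I(X;Z)$, against the task-relevant information retained in the representation, $I(Z;Y)$.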