GeoIB: Geometry-Aware Information Bottleneck via Statistical-Manifold Compression

Information Bottleneck (IB) is widely used, but in deep learning it is usually implemented through tractable surrogates, such as variational bounds or neural mutual information (MI) estimators, rather than by directly controlling the MI I(X;Z) itself. The looseness of these surrogates and their estimator-dependent bias can leave IB "compression" only indirectly controlled and make optimization fragile. We revisit the IB problem through the lens of information geometry and propose a \textbf{Geo}metric \textbf{I}nformation \textbf{B}ottleneck (\textbf{GeoIB}) that dispenses with MI estimation. We show that I(X;Z) and I(Z;Y) admit exact projection forms as minimal Kullback-Leibler (KL) divergences from the joint distributions to their respective independence manifolds. Guided by this view, GeoIB controls information compression with two complementary terms: (i) a distribution-level Fisher-Rao (FR) discrepancy, which matches KL to second order and is reparameterization-invariant; and (ii) a geometry-level Jacobian-Frobenius (JF) term that provides a local capacity-type upper bound on I(X;Z) by penalizing pullback volume expansion of the encoder. We further derive a natural-gradient optimizer consistent with the FR metric and prove that the standard additive natural-gradient step is first-order equivalent to the geodesic update. In extensive experiments on popular datasets, GeoIB achieves a better trade-off between prediction accuracy and compression ratio in the information plane than mainstream IB baselines. By unifying distributional and geometric regularization under a single bottleneck multiplier, GeoIB improves invariance and optimization stability. The source code of GeoIB is released at "https://anonymous.4open.science/r/G-IB-0569".
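
Two of the geometric facts stated above can be checked numerically on toy discrete distributions. The sketch below is illustrative only and is not taken from the GeoIB code release: the helper names (`kl`, `mutual_information`, `fisher_rao`) are hypothetical, the distributions are random categorical tables, and the closed-form FR distance used here is the standard one for categorical distributions, assumed as a stand-in for whatever parameterization the method itself uses. It checks (a) that the MI of a joint table is no larger than the KL divergence to any other product distribution, and (b) that KL is close to half the squared FR distance for a small perturbation.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) in nats for discrete distributions with full support."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    return float(np.sum(p * np.log(p / q)))

def mutual_information(joint):
    """I(X;Z) computed as KL from the joint to the product of its marginals."""
    px = joint.sum(axis=1, keepdims=True)
    pz = joint.sum(axis=0, keepdims=True)
    return kl(joint, px * pz)

def fisher_rao(p, q):
    """Fisher-Rao geodesic distance between two categorical distributions."""
    bc = np.sum(np.sqrt(np.asarray(p) * np.asarray(q)))  # Bhattacharyya coefficient
    return float(2.0 * np.arccos(np.clip(bc, -1.0, 1.0)))

rng = np.random.default_rng(0)

# (a) Projection form: the product of marginals minimizes KL over product distributions.
joint = rng.random((4, 5)) + 0.1
joint /= joint.sum()
mi = mutual_information(joint)
gaps = []
for _ in range(1000):
    qx = rng.dirichlet(np.ones(4))[:, None]  # arbitrary candidate marginal on X
    rz = rng.dirichlet(np.ones(5))[None, :]  # arbitrary candidate marginal on Z
    gaps.append(kl(joint, qx * rz) - mi)
min_gap = min(gaps)  # should never be negative, up to rounding

# (b) Second-order agreement: KL(p || q) is close to 0.5 * d_FR(p, q)^2 for nearby q.
p = rng.random(6) + 0.5
p /= p.sum()
v = rng.standard_normal(6)
v -= v.mean()  # zero-sum direction keeps q on the probability simplex
q = p + 1e-4 * v
ratio = kl(p, q) / (0.5 * fisher_rao(p, q) ** 2)

print(f"I(X;Z) = {mi:.4f} nats, smallest KL gap over random products = {min_gap:.4f}")
print(f"KL / (0.5 * FR^2) = {ratio:.4f}")
```

The random search in (a) only samples product distributions, so it illustrates the projection property rather than proving it; the ratio in (b) approaches 1 as the perturbation shrinks.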
