The Information Bottleneck (IB) is a method of lossy compression of relevant information. Its rate-distortion (RD) curve describes the fundamental tradeoff between input compression and the preservation of relevant information embedded in the input. However, the curve conceals the underlying dynamics of optimal input encodings. We argue that these typically follow a piecewise smooth trajectory as input information is compressed, as was recently shown for RD. These smooth dynamics are interrupted at a bifurcation, where an optimal encoding changes qualitatively. By leveraging the IB's intimate relations with RD, we provide substantial insights into the IB's solution structure, highlighting caveats in its finite-dimensional treatments. Sub-optimal solutions are seen to collide or exchange optimality at IB bifurcations. Despite the wide acceptance of the IB and its applications, there are surprisingly few techniques to solve it numerically, even for finite problems whose distribution is known. We derive anew the IB's first-order Ordinary Differential Equation, which describes the dynamics underlying its optimal tradeoff curve. To exploit these dynamics, we not only detect IB bifurcations but also identify their type in order to handle them accordingly. Identifying the bifurcation type allows us to follow a solution's trajectory along the optimal curve under mild assumptions, rather than approaching the curve from sub-optimal directions. We thereby translate an understanding of IB bifurcations into a surprisingly accurate numerical algorithm.
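For orientation, the finite IB problem with a known distribution p(x, y) is commonly posed as minimizing I(X;T) - beta * I(T;Y) over encoders p(t|x). The sketch below is not the ODE-based method described above. It is the classical self-consistent (Blahut-Arimoto-style) fixed-point iteration, shown only as an example of a solver that approaches the optimal curve from sub-optimal directions. The function names, the random initialization, the iteration count, and the smoothing constant `eps` are illustrative assumptions, and the sketch assumes p(x) > 0 for every x.

```python
import numpy as np


def mutual_information(p_joint):
    """Mutual information (in nats) of a joint distribution given as a 2-D array."""
    p_a = p_joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    p_b = p_joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    mask = p_joint > 0                         # skip zero-probability cells
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (p_a @ p_b)[mask])))


def ib_fixed_point(p_xy, beta, n_clusters, n_iter=500, seed=0):
    """Hypothetical helper: iterate the IB self-consistent equations at fixed beta.

    Returns the encoder p(t|x) together with I(X;T) and I(T;Y).
    Assumes every row of p_xy has positive mass, i.e. p(x) > 0.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12                                # numerical smoothing (assumption)
    p_x = p_xy.sum(axis=1)                     # p(x), shape (X,)
    p_y_given_x = p_xy / p_x[:, None]          # p(y|x), shape (X, Y)

    # Random initial encoder p(t|x); each row is a distribution over t.
    q_t_given_x = rng.random((len(p_x), n_clusters))
    q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        q_t = p_x @ q_t_given_x                                   # p(t)
        p_ty = (q_t_given_x * p_x[:, None]).T @ p_y_given_x       # joint p(t, y)
        q_y_given_t = p_ty / np.maximum(q_t[:, None], eps)        # decoder p(y|t)

        # KL divergence D[p(y|x) || p(y|t)] for every pair (x, t).
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(q_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)  # shape (X, T)

        # Encoder update: p(t|x) proportional to p(t) * exp(-beta * KL).
        logits = np.log(q_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)               # stabilize exp
        q_t_given_x = np.exp(logits)
        q_t_given_x /= q_t_given_x.sum(axis=1, keepdims=True)

    p_xt = q_t_given_x * p_x[:, None]          # joint p(x, t)
    p_ty = p_xt.T @ p_y_given_x                # joint p(t, y)
    return q_t_given_x, mutual_information(p_xt), mutual_information(p_ty)
```

Calling `ib_fixed_point(p_xy, beta, n_clusters)` over a grid of `beta` values gives points in the information plane (I(X;T), I(T;Y)). Such iterations can settle on sub-optimal fixed points, which is one reason tracking a solution along the optimal curve, as proposed here, may be preferable.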