The Information Bottleneck (IB) is a method of lossy compression. Its rate-distortion (RD) curve describes the fundamental tradeoff between input compression and the preservation of relevant information. However, this curve conceals the underlying dynamics of optimal input encodings. We argue that these encodings typically follow a piecewise smooth trajectory as the input information is compressed, as was recently shown for RD. The smooth dynamics are interrupted at a bifurcation, where an optimal encoding changes qualitatively. Leveraging the IB's close relationship with RD, one can see that sub-optimal solutions collide or exchange optimality there. Despite the wide acceptance of the IB and its applications, there are surprisingly few techniques for solving it numerically, even for finite problems whose distribution is known. We derive anew the IB's first-order Ordinary Differential Equation, which describes the dynamics underlying its optimal tradeoff curve. To exploit these dynamics, one needs not only to detect IB bifurcations but also to identify their type in order to handle them accordingly. Identifying the type allows us to follow a solution's trajectory along the optimal curve, under mild assumptions, rather than approach the optimal IB curve from sub-optimal directions. We thereby translate an understanding of IB bifurcations into a surprisingly accurate numerical algorithm.
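For orientation, the finite, known-distribution setting mentioned above can be illustrated with a minimal sketch of the classic self-consistent (Blahut-Arimoto-type) IB iteration. This is not the ODE-based method of this work. It is the kind of baseline that approaches the optimal curve from sub-optimal directions at a fixed tradeoff parameter. The function names `mutual_information` and `ib_fixed_point`, the parameter `beta`, the small constant `eps`, and the example distribution are all assumptions made for illustration.

```python
import numpy as np

def mutual_information(p_joint):
    """Mutual information (in nats) of a joint distribution given as a 2-D array."""
    pa = p_joint.sum(axis=1, keepdims=True)
    pb = p_joint.sum(axis=0, keepdims=True)
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (pa @ pb)[mask])))

def ib_fixed_point(p_xy, beta, n_clusters, n_iter=500, seed=0):
    """Iterate the IB self-consistent equations at a fixed beta (illustrative sketch).

    Assumes p_xy is a joint distribution with strictly positive marginal p(x).
    Returns the encoder p(t|x) together with I(X;T) and I(T;Y).
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12                                   # guards against log(0) and 0/0
    p_x = p_xy.sum(axis=1)                        # shape (X,)
    p_y_given_x = p_xy / p_x[:, None]             # shape (X, Y)

    # Random initial encoder p(t|x); each row sums to 1.
    enc = rng.random((len(p_x), n_clusters))
    enc /= enc.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        p_t = p_x @ enc                           # cluster marginal p(t)
        # Decoder p(y|t) = sum_x p(y|x) p(x) p(t|x) / p(t)
        p_y_given_t = (enc * p_x[:, None]).T @ p_y_given_x / (p_t[:, None] + eps)
        # KL(p(y|x) || p(y|t)) for every pair (x, t)
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)
        # Encoder update: p(t|x) proportional to p(t) * exp(-beta * KL)
        logits = np.log(p_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        enc = np.exp(logits)
        enc /= enc.sum(axis=1, keepdims=True)

    p_xt = enc * p_x[:, None]                     # joint p(x, t)
    p_ty = p_xt.T @ p_y_given_x                   # joint p(t, y)
    return enc, mutual_information(p_xt), mutual_information(p_ty)

# Hypothetical 3x2 joint distribution p(x, y), used only as an example.
p_xy = np.array([[0.30, 0.05],
                 [0.05, 0.30],
                 [0.15, 0.15]])
encoder, i_xt, i_ty = ib_fixed_point(p_xy, beta=5.0, n_clusters=2)
print(f"I(X;T) = {i_xt:.4f} nats, I(T;Y) = {i_ty:.4f} nats")
```

Sweeping `beta` over a grid and re-running this iteration gives a rough picture of the tradeoff curve, but each run may stall at a sub-optimal fixed point, particularly near a bifurcation.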