The partial information decomposition (PID) aims to quantify the amount of redundant information that a set of sources provides about a target. Here, we show that this goal can be formulated as a type of information bottleneck (IB) problem, termed the "redundancy bottleneck" (RB). The RB formalizes a tradeoff between prediction and compression: it extracts the information from the sources that best predicts the target, without revealing which source provided the information. It can be understood as a generalization of "Blackwell redundancy", which we previously proposed as a principled measure of PID redundancy. The "RB curve" quantifies the prediction--compression tradeoff at multiple scales. This curve can also be quantified for individual sources, allowing subsets of redundant sources to be identified without combinatorial optimization. We provide an efficient iterative algorithm for computing the RB curve.