One of the fundamental steps toward understanding a complex system is identifying the variation at the scale of the system's components that is most relevant to behavior on a macroscopic scale. Mutual information provides a natural means of linking variation across scales of a system because it does not depend on the functional form of the relationship between observables. However, characterizing how information is distributed across a set of observables is computationally challenging and generally infeasible beyond a handful of measurements. Here we propose a practical and general methodology that uses machine learning to decompose the information contained in a set of measurements by jointly optimizing a lossy compression of each measurement. Guided by the distributed information bottleneck as a learning objective, the information decomposition identifies the variation in the measurements of the system state most relevant to specified macroscale behavior. We focus our analysis on two paradigmatic complex systems: a Boolean circuit and an amorphous material undergoing plastic deformation. In both examples, the large entropy of the system state is decomposed, bit by bit, in terms of what is most related to macroscale behavior. The identification of meaningful variation in data, with the full generality brought by information theory, is thereby made practical for studying the connection between micro- and macroscale structure in complex systems.
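As a rough illustration of what it means for information about a macroscale output to be distributed across observables, the sketch below computes exact mutual information by brute-force enumeration for a toy three-input Boolean circuit. The circuit `(x1 AND x2) OR x3`, the uniform input distribution, and the helper names are assumptions made for this example; they are not taken from the paper, and the code does not implement the distributed information bottleneck itself. It only shows the quantity being decomposed, and why enumeration over all 2^n input states stops being feasible as the number of measurements grows.

```python
from collections import Counter
from itertools import product
from math import log2


def circuit(x1, x2, x3):
    # Hypothetical toy circuit, chosen only for illustration.
    return (x1 and x2) or x3


def mutual_information(pairs):
    """Mutual information I(A;B) in bits, given equally likely (a, b) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    return sum(
        (c / n) * log2((c / n) / ((marg_a[a] / n) * (marg_b[b] / n)))
        for (a, b), c in joint.items()
    )


# Enumerate all 2^3 input states, assumed uniformly distributed.
states = list(product([0, 1], repeat=3))
outputs = [circuit(*s) for s in states]

# Information each single input carries about the output.
per_input = [
    mutual_information([(s[i], y) for s, y in zip(states, outputs)])
    for i in range(3)
]

# Information the full input state carries about the output.
joint_info = mutual_information(list(zip(states, outputs)))

for i, info in enumerate(per_input, start=1):
    print(f"I(x{i}; y) = {info:.4f} bits")
print(f"I(x1, x2, x3; y) = {joint_info:.4f} bits")
```

In this toy case the input `x3` carries more information about the output than `x1` or `x2`, and the single-input values do not add up to the joint information, which is why a joint treatment of the measurements is needed rather than scoring each one in isolation.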