Multivariate information theory provides a general and principled framework for understanding how the components of a complex system are connected. Existing analyses are coarse, built up from characterizations of discrete subsystems, and can be computationally prohibitive. In this work, we propose to study the continuous space of possible descriptions of a composite system as a window into its organizational structure. A description consists of specific information conveyed about each of the components, and the space of possible descriptions is equivalent to the space of lossy compression schemes of the components. We introduce a machine learning framework to optimize descriptions that extremize key information-theoretic quantities used to characterize organization, such as total correlation and O-information. Through case studies on spin systems, Sudoku boards, and letter sequences from natural language, we identify extremal descriptions that reveal how system-wide variation emerges from individual components. By integrating machine learning into a fine-grained information-theoretic analysis of composite random variables, our framework opens new avenues for probing the structure of real-world complex systems.
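For readers unfamiliar with the two quantities named above, the following minimal sketch computes them exactly for a small discrete system. It is not the machine learning framework proposed in this work: it assumes a joint distribution small enough to enumerate, and it uses the standard definitions, total correlation TC = Σ_i H(X_i) − H(X), dual total correlation DTC = Σ_i H(X_{−i}) − (n−1) H(X), and O-information Ω = TC − DTC. All function and variable names are illustrative.

```python
import math
from collections import defaultdict

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(pmf, keep):
    """Marginalize a joint pmf over tuples onto the component indices in `keep`."""
    out = defaultdict(float)
    for outcome, p in pmf.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)

def total_correlation(pmf, n):
    """TC = sum of single-component entropies minus the joint entropy."""
    return sum(entropy(marginal(pmf, [i])) for i in range(n)) - entropy(pmf)

def dual_total_correlation(pmf, n):
    """DTC = sum of leave-one-out entropies minus (n - 1) times the joint entropy."""
    def rest(i):
        return [j for j in range(n) if j != i]
    leave_one_out = sum(entropy(marginal(pmf, rest(i))) for i in range(n))
    return leave_one_out - (n - 1) * entropy(pmf)

def o_information(pmf, n):
    """O-information = TC - DTC; positive means redundancy-dominated,
    negative means synergy-dominated."""
    return total_correlation(pmf, n) - dual_total_correlation(pmf, n)

# Toy systems (assumed for illustration, not taken from the case studies):
# three bits where the third is the XOR of the first two, and three identical bits.
xor_bits = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
copy_bits = {(a, a, a): 0.5 for a in (0, 1)}

print(total_correlation(xor_bits, 3), o_information(xor_bits, 3))
print(total_correlation(copy_bits, 3), o_information(copy_bits, 3))
```

The XOR system is the textbook synergistic case (negative O-information) and the copied bit is the textbook redundant case (positive O-information). Exhaustive enumeration like this scales exponentially with the number of components, which is one reason exact analyses become prohibitive for larger systems.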