The information diagram and the I-measure are useful mnemonics in which random variables are treated as sets, and entropy and mutual information are treated as a signed measure. Although the I-measure has been successful in machine proofs of entropy inequalities, the theoretical underpinning of the ``random variables as sets'' analogy remained unclear until recent works on mappings from random variables to sets by Ellerman (recovering the order-$2$ Tsallis entropy over general probability spaces) and by Down and Mediano (recovering the Shannon entropy over discrete probability spaces). We generalize these constructions by designing a mapping that recovers the Shannon entropy (and the information density) over general probability spaces. Moreover, the mapping has an intuitive interpretation based on the arrival time in a Poisson process, allowing us to understand the union, intersection, and difference of (sets corresponding to) random variables and events. Cross entropy, KL divergence, and conditional entropy given an event can be obtained as set intersections. We propose a generalization of the information diagram that also includes events, and demonstrate its use with a diagrammatic proof of Fano's inequality.
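As a small illustration of why the I-measure has to be a signed measure, the following sketch computes the regions of a three-variable information diagram by inclusion-exclusion over joint entropies. It is not the mapping proposed in this work; the XOR example and the helper names `entropy` and `H` are assumptions chosen only for illustration.

```python
from itertools import product
from math import log2


def entropy(joint, idx):
    """Shannon entropy in bits of the marginal on the coordinates in idx."""
    marginal = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in idx)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


# Illustrative assumption: X and Y are independent fair bits, Z = X xor Y.
joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}


def H(*idx):
    """Joint entropy of the selected coordinates (0 = X, 1 = Y, 2 = Z)."""
    return entropy(joint, idx)


# Pairwise "intersection": I(X;Y) = H(X) + H(Y) - H(X,Y).
i_xy = H(0) + H(1) - H(0, 1)

# Triple "intersection" by inclusion-exclusion over all joint entropies.
i_xyz = (H(0) + H(1) + H(2)
         - H(0, 1) - H(0, 2) - H(1, 2)
         + H(0, 1, 2))

print(f"I(X;Y)   = {i_xy:.3f} bits")
print(f"I(X;Y;Z) = {i_xyz:.3f} bits")
```

Here the central region of the diagram carries negative mass even though every entropy is nonnegative, which is why the set analogy requires a signed measure rather than an ordinary one.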