Shannon entropy is a polymatroidal set function and lies at the foundation of information theory, yet the class of entropic polymatroids is strictly smaller than the class of all submodular functions. In parallel, submodular and combinatorial information measures (SIMs) have recently been proposed as a principled framework for extending entropy, mutual information, and conditional mutual information to general submodular functions, and have been used extensively in data subset selection, active learning, domain adaptation, and representation learning. This raises a natural and fundamental question: are the monotone submodular functions most commonly used in practice entropic? In this paper, we answer this question in the affirmative for a broad class of widely used polymatroid functions. We provide explicit entropic constructions for set cover and coverage functions, facility location, saturated coverage, concave-over-modular functions via truncations, and monotone graph-cut-type objectives. Our results show that these functions can be realized exactly as Shannon entropies of appropriately constructed random variables. As a consequence, for these functions, submodular mutual information coincides with classical mutual information, conditional gain specializes to conditional entropy, and submodular conditional mutual information reduces to standard conditional mutual information in the entropic sense. These results establish a direct bridge between combinatorial information measures and classical information theory for many of the most common submodular objectives used in applications.
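The claim that coverage functions are exactly realizable as Shannon entropies can be illustrated with the standard construction: attach one independent fair bit to each universe element, and let the random variable for ground-set item i reveal the bits of the elements it covers. The joint entropy of any subset of these variables then equals the number of covered elements. The sketch below checks this numerically on a small hypothetical instance (the universe, the cover sets, and all names are illustrative assumptions, not from the paper).

```python
import itertools
import math

# Hypothetical instance: 3 ground-set items, each covering a subset
# of a 4-element universe (chosen for illustration only).
universe = [0, 1, 2, 3]
covers = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}}

def covered(S):
    """Union of universe elements covered by the items in S."""
    return set().union(*(covers[i] for i in S)) if S else set()

def coverage(S):
    """Coverage function f(S) = |union of covered elements|."""
    return len(covered(S))

def joint_entropy(S):
    """Shannon entropy (bits) of (X_i)_{i in S}, where each X_i reveals
    one independent fair bit Z_u per universe element u it covers."""
    revealed = sorted(covered(S))
    counts = {}
    # Enumerate all assignments of the independent bits Z_u.
    for bits in itertools.product([0, 1], repeat=len(universe)):
        key = tuple(bits[u] for u in revealed)
        counts[key] = counts.get(key, 0) + 1
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# The entropy of the constructed variables matches coverage exactly.
for S in [set(), {0}, {1, 2}, {0, 1, 2}]:
    assert abs(joint_entropy(S) - coverage(S)) < 1e-9
```

Under this construction, submodularity of coverage is inherited directly from the submodularity of joint Shannon entropy, which is the bridge the abstract describes.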