Entropy estimation is of practical importance in information theory and statistical science. Many existing entropy estimators suffer from an estimation bias that grows quickly with dimensionality, which renders them unsuitable for high-dimensional problems. In this work we propose a transform-based method for high-dimensional entropy estimation, which consists of two main ingredients. First, by modifying the k-NN based entropy estimator, we propose a new estimator that has a small estimation bias for samples that are close to uniformly distributed. Second, we design a normalizing-flow-based mapping that pushes samples toward a uniform distribution, and we derive the relation between the entropy of the original samples and that of the transformed ones. As a result, the entropy of a given set of samples is estimated by first transforming them toward a uniform distribution and then applying the proposed estimator to the transformed samples. The performance of the proposed method is compared against several existing entropy estimators, on both mathematical examples and real-world applications.
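The two-step idea (transform, then estimate) can be illustrated with a minimal sketch. It is not the estimator or the flow proposed in this work. It assumes the classical Kozachenko-Leonenko k-NN estimator in place of the modified one, and a componentwise Gaussian CDF map in place of a trained normalizing flow. The function names `knn_entropy`, `gaussian_cdf_map`, and `entropy_via_transform` are hypothetical. The sketch relies only on the standard change-of-variables identity for an invertible, differentiable map z = f(x): H(X) = H(Z) - E[log |det J_f(X)|].

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln
from scipy.stats import norm


def knn_entropy(samples, k=3):
    """Classical Kozachenko-Leonenko k-NN entropy estimate in nats.

    Assumes an (n, d) array with no duplicate points (a zero
    neighbour distance would make the logarithm diverge).
    """
    x = np.asarray(samples, dtype=float)
    n, d = x.shape
    # Query k + 1 neighbours because column 0 is the point itself.
    dist, _ = cKDTree(x).query(x, k=k + 1)
    eps = dist[:, k]
    # Log-volume of the d-dimensional unit Euclidean ball.
    log_unit_ball = 0.5 * d * np.log(np.pi) - gammaln(0.5 * d + 1.0)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(eps))


def gaussian_cdf_map(x):
    """Illustrative stand-in for a flow: push each coordinate through
    a fitted Gaussian CDF, returning z and log |det J_f(x)| per sample."""
    mu, sigma = x.mean(axis=0), x.std(axis=0)
    u = (x - mu) / sigma
    z = norm.cdf(u)
    # The Jacobian is diagonal, with entries pdf(u_j) / sigma_j.
    log_det = np.sum(norm.logpdf(u) - np.log(sigma), axis=1)
    return z, log_det


def entropy_via_transform(samples, transform, k=3):
    """Estimate H(X) as H(Z) minus the mean log-determinant of the map."""
    z, log_det = transform(np.asarray(samples, dtype=float))
    return knn_entropy(z, k=k) - np.mean(log_det)
```

For roughly Gaussian data, `entropy_via_transform(x, gaussian_cdf_map)` should land close to `knn_entropy(x)`. In this sketch the transformed samples lie in the unit cube, where a plain k-NN estimator is affected by boundary effects, so the stand-in estimator is only a rough proxy there.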