The Kullback-Leibler (KL) divergence is frequently used in data science. For discrete distributions on large state spaces, approximations of probability vectors may result in a few small negative entries, rendering the KL divergence undefined. We address this problem by introducing a parameterized family of substitute divergence measures, the shifted KL (sKL) divergence measures. Our approach is generic and does not increase the computational overhead. We show that the sKL divergence shares important theoretical properties with the KL divergence and discuss how its shift parameters should be chosen. If Gaussian noise is added to a probability vector, we prove that the average sKL divergence converges to the KL divergence for small enough noise. We also show that our method solves the problem of negative entries in an application from computational oncology, the optimization of Mutual Hazard Networks for cancer progression using tensor-train approximations.
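To make the problem and the proposed remedy concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the sKL formula, so the code assumes one plausible form: both vectors are shifted by a small positive parameter before the usual KL expression is evaluated, i.e. the sum over i of (p_i + eps) * log((p_i + eps) / (q_i + eps)). The function names `kl` and `shifted_kl`, the scalar shift `eps`, and the example vectors are all hypothetical.

```python
import numpy as np


def kl(p, q):
    """Standard KL divergence sum_i p_i * log(p_i / q_i), skipping entries with p_i == 0.

    Returns nan if q has a negative entry where p is positive.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p != 0
    with np.errstate(invalid="ignore", divide="ignore"):
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def shifted_kl(p, q, eps):
    """Assumed form of a shifted KL divergence (illustrative only).

    Both vectors are shifted by eps so that every entry is strictly
    positive before the logarithm is taken. eps may be a scalar or a
    vector with the same shape as p.
    """
    ps = np.asarray(p, dtype=float) + eps
    qs = np.asarray(q, dtype=float) + eps
    if np.any(ps <= 0) or np.any(qs <= 0):
        raise ValueError("shift too small: some shifted entries are not positive")
    return float(np.sum(ps * np.log(ps / qs)))


# A probability vector and an approximation with one small negative entry.
p = np.array([0.7, 0.2, 0.0999, 0.0001])
q = np.array([0.7, 0.2, 0.1005, -0.0005])

plain = kl(p, q)                   # nan: log of a negative ratio
shifted = shifted_kl(p, q, 1e-3)   # finite, because q + eps > 0 everywhere
```

In this sketch the shift costs one extra addition per entry, which is in line with the claim that the approach adds no meaningful computational overhead. With `eps = 0` and strictly positive inputs, `shifted_kl` reduces to the ordinary KL divergence.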