Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the problem of computing the TV distance between two product distributions over the domain $\{0,1\}^n$. In particular, we establish the following results. 1. Exactly computing the TV distance between two product distributions is $\#\mathsf{P}$-complete. This is in stark contrast with other distance measures such as KL divergence, chi-square divergence, and Hellinger distance, which tensorize over the marginals and therefore admit efficient algorithms. 2. There is a deterministic fully polynomial-time approximation scheme (FPTAS) for computing the TV distance between two product distributions $P$ and $Q$ when $Q$ is the uniform distribution. This result extends to the case where $Q$ has a constant number of distinct marginals. In contrast, we show that when $P$ and $Q$ are Bayes net distributions, computing a relative approximation of their TV distance is $\mathsf{NP}$-hard.
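To illustrate the contrast between TV distance and a tensorizing measure, the following is a minimal Python sketch and is not taken from the paper. It assumes each product distribution over $\{0,1\}^n$ is given as a list of marginal probabilities $\Pr[x_i = 1]$, and the function names are hypothetical. The TV distance is computed by naive enumeration of all $2^n$ points, while the squared Hellinger distance, taken here as $1 - \sum_x \sqrt{P(x)Q(x)}$, is computed in $O(n)$ time because the Bhattacharyya coefficient of a product distribution is the product of the per-coordinate coefficients.

```python
from itertools import product
from math import prod, sqrt


def point_mass(marginals, x):
    """Probability of the bit string x under a product distribution.

    marginals[i] is assumed to be Pr[x_i = 1].
    """
    return prod(m if bit else 1.0 - m for m, bit in zip(marginals, x))


def tv_bruteforce(p, q):
    """TV distance by enumerating all 2^n points (exponential time)."""
    n = len(p)
    total = sum(
        abs(point_mass(p, x) - point_mass(q, x))
        for x in product((0, 1), repeat=n)
    )
    return total / 2.0


def hellinger_sq_tensorized(p, q):
    """Squared Hellinger distance in O(n) time via tensorization.

    Uses the convention H^2 = 1 - BC, where the Bhattacharyya
    coefficient BC factorizes over the coordinates.
    """
    bc = prod(
        sqrt(pi * qi) + sqrt((1.0 - pi) * (1.0 - qi))
        for pi, qi in zip(p, q)
    )
    return 1.0 - bc


def hellinger_sq_bruteforce(p, q):
    """Same quantity by enumeration, used only as a cross-check."""
    n = len(p)
    bc = sum(
        sqrt(point_mass(p, x) * point_mass(q, x))
        for x in product((0, 1), repeat=n)
    )
    return 1.0 - bc


if __name__ == "__main__":
    p = [0.7, 0.2, 0.9, 0.4]
    q = [0.5, 0.5, 0.5, 0.5]  # uniform distribution on {0,1}^4
    print("TV (brute force):       ", tv_bruteforce(p, q))
    print("H^2 (tensorized, O(n)): ", hellinger_sq_tensorized(p, q))
    print("H^2 (brute force check):", hellinger_sq_bruteforce(p, q))
```

No analogous per-coordinate formula is used for the TV distance in this sketch, so the enumeration grows exponentially with $n$; it is meant only as a reference computation for small instances.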