Given a non-negative $n \times n$ matrix viewed as a set of distances between $n$ points, we consider the property testing problem of deciding whether it is a metric. We also consider the same problem for two special classes of metrics, tree metrics and ultrametrics. For general metrics, our paper is the first to consider these questions. We prove an upper bound of $O(n^{2/3}/\epsilon^{4/3})$ on the query complexity of this problem. Our algorithm is simple, but the analysis requires great care in bounding the variance of the number of violating triangles in a sample. When $\epsilon$ is a slowly decreasing function of $n$ (rather than a constant, as is standard), we prove a lower bound of $\Omega (n^{2/3})$, which matches the upper bound's dependence on $n$ and rules out any property tester with $o(n^{2/3})$ query complexity unless its dependence on $1/\epsilon$ is super-polynomial. Next, we turn to tree metrics and ultrametrics. Upper and lower bounds were already known for these classes; we improve them considerably, showing essentially tight bounds of $\tilde{O}(1/\epsilon )$ on the sample complexity. We also show a lower bound of $\Omega ( 1/\epsilon^{4/3} )$ on the query complexity. Our upper bounds are derived from a more careful analysis of a natural, simple algorithm. For the lower bounds, we construct distributions on NO instances for which it is hard to find a witness showing that they are not ultrametrics.
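The abstract does not spell out the testers themselves, so the following is only a minimal sketch of one natural sample-and-check approach, not the paper's algorithm. It assumes the input is a symmetric matrix given as a list of lists; the names `violates_triangle`, `violates_ultrametric`, and `sample_and_check` are hypothetical. The sketch samples points, queries every pair among them once, and rejects only if some sampled triangle is a witness.

```python
import itertools
import random


def violates_triangle(a, b, c):
    # A triangle is a witness against metricity when its longest side
    # exceeds the sum of the other two sides.
    return 2 * max(a, b, c) > a + b + c


def violates_ultrametric(a, b, c):
    # In an ultrametric, the two largest sides of every triangle are equal.
    s = sorted((a, b, c))
    return s[1] != s[2]


def sample_and_check(matrix, sample_size, violates, rng=None):
    """Accept (return True) unless a sampled triangle is a witness.

    Hypothetical helper: `matrix` is assumed symmetric, and only the
    triangle condition given by `violates` is checked.
    """
    rng = rng or random.Random()
    n = len(matrix)
    points = sorted(rng.sample(range(n), min(sample_size, n)))
    # Query each sampled pair exactly once: s * (s - 1) / 2 queries.
    d = {(i, j): matrix[i][j] for i, j in itertools.combinations(points, 2)}
    for i, j, k in itertools.combinations(points, 3):
        if violates(d[(i, j)], d[(j, k)], d[(i, k)]):
            return False  # found a witness, reject
    return True  # no witness in the sample, accept
```

Because the sketch rejects only on an explicit witness, it always accepts a true metric (or ultrametric). With a sample of $s$ points it makes about $s^2/2$ queries, so $s \approx n^{1/3}/\epsilon^{2/3}$ gives roughly $n^{2/3}/\epsilon^{4/3}$ queries, the same shape as the stated upper bound; whether the paper's tester uses exactly this sample size is an assumption here.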