Probabilistic proofs of the Johnson-Lindenstrauss lemma imply that random projection can reduce the dimension of a data set and approximately preserve pairwise distances. If a distance being approximately preserved is called a success, and the complement of this event is called a failure, then such a random projection likely results in no failures. Assuming a Gaussian random projection, the lemma is proved by showing that the no-failure probability is positive using a combination of Bonferroni's inequality and Markov's inequality. This paper modifies this proof in two ways to obtain a greater lower bound on the no-failure probability. First, Bonferroni's inequality is applied to pairs of failures instead of individual failures. Second, since a pair of projection errors has a bivariate gamma distribution, the probability of a pair of successes is bounded using an inequality from Jensen (1969). If $n$ is the number of points to be embedded and $\mu$ is the probability of a success, then this leads to an increase in the lower bound on the no-failure probability of $\frac{1}{2}\binom{n}{2}(1-\mu)^2$ if $\binom{n}{2}$ is even and $\frac{1}{2}\left(\binom{n}{2}-1\right)(1-\mu)^2$ if $\binom{n}{2}$ is odd. For example, if $n=10^5$ points are to be embedded in $k=10^4$ dimensions with a tolerance of $\epsilon=0.1$, then the improvement in the lower bound is on the order of $10^{-14}$. We also show that further improvement is possible if the inequality in Jensen (1969) extends to three successes, though we do not have a proof of this result.
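As a rough illustration of the stated improvement, the two parity cases can be folded into one expression: $\left\lfloor \binom{n}{2}/2 \right\rfloor (1-\mu)^2$. The sketch below only evaluates that formula. The function name and the choice to pass the failure probability $1-\mu$ directly (rather than $\mu$, to avoid floating-point cancellation when $\mu$ is very close to 1) are assumptions made for this example. It does not compute $\mu$ itself, which depends on $k$ and $\epsilon$.

```python
from math import comb


def lower_bound_improvement(n: int, failure_prob: float) -> float:
    """Evaluate floor(C(n, 2) / 2) * (1 - mu)^2, with failure_prob = 1 - mu.

    Hypothetical helper for illustration only: it restates the formula
    quoted in the abstract and makes no claim about how mu is obtained.
    """
    if n < 2:
        raise ValueError("need at least two points to form a pair")
    if not 0.0 <= failure_prob <= 1.0:
        raise ValueError("failure_prob must lie in [0, 1]")

    num_pairs = comb(n, 2)  # number of pairwise distances
    # Integer division covers both cases in the abstract:
    #   C(n, 2) even -> C(n, 2) / 2
    #   C(n, 2) odd  -> (C(n, 2) - 1) / 2
    half_pairs = num_pairs // 2
    return half_pairs * failure_prob ** 2


if __name__ == "__main__":
    # n = 4 gives C(4, 2) = 6 (even); n = 3 gives C(3, 2) = 3 (odd).
    print(lower_bound_improvement(4, 0.5))
    print(lower_bound_improvement(3, 0.5))
```

Because the improvement scales with $(1-\mu)^2$, it is very small whenever a single failure is already unlikely, which is consistent with the small magnitude quoted in the example above.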