Probabilistic proofs of the Johnson-Lindenstrauss lemma imply that random projection can reduce the dimension of a data set and approximately preserve pairwise distances. If a distance being approximately preserved is called a success, and the complement of this event is called a failure, then such a random projection likely results in no failures. Assuming a Gaussian random projection, the lemma is proved by showing that the no-failure probability is positive using a combination of Bonferroni's inequality and Markov's inequality. This paper modifies this proof in two ways to obtain a greater lower bound on the no-failure probability. First, Bonferroni's inequality is applied to pairs of failures instead of individual failures. Second, since a pair of projection errors has a bivariate gamma distribution, the probability of a pair of successes is bounded using an inequality from Jensen (1969). If $n$ is the number of points to be embedded and $\mu$ is the probability of a success, then this leads to an increase in the lower bound on the no-failure probability of $\frac{1}{2}\binom{n}{2}(1-\mu)^2$ if $\binom{n}{2}$ is even and $\frac{1}{2}\left(\binom{n}{2}-1\right)(1-\mu)^2$ if $\binom{n}{2}$ is odd. For example, if $n=10^5$ points are to be embedded in $k=10^4$ dimensions with a tolerance of $\epsilon=0.1$, then the improvement in the lower bound is on the order of $10^{-14}$. We also show that further improvement is possible if the inequality in Jensen (1969) extends to three successes, though we do not have a proof of this result.
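As a rough illustration of the stated improvement, the sketch below evaluates the two-case formula for given $n$ and $\mu$. The function name `lower_bound_improvement` is hypothetical, and $\mu$ is taken as an input because the abstract does not say how it is computed from $k$ and $\epsilon$. Integer division covers both parity cases at once, since $\lfloor \binom{n}{2}/2 \rfloor$ equals $\frac{1}{2}\binom{n}{2}$ when $\binom{n}{2}$ is even and $\frac{1}{2}\left(\binom{n}{2}-1\right)$ when it is odd.

```python
from math import comb


def lower_bound_improvement(n: int, mu: float) -> float:
    """Evaluate the improvement formula quoted in the abstract.

    n  : number of points to be embedded
    mu : probability that a single pairwise distance is approximately
         preserved (supplied by the caller; not derived here)
    """
    if n < 2:
        raise ValueError("need at least two points to form a pair")
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must be a probability in [0, 1]")

    m = comb(n, 2)   # number of pairwise distances, binom(n, 2)
    half = m // 2    # m/2 if m is even, (m - 1)/2 if m is odd
    return half * (1.0 - mu) ** 2


if __name__ == "__main__":
    # Small illustrative values, not taken from the paper.
    print(lower_bound_improvement(4, 0.9))  # binom(4, 2) = 6, even case
    print(lower_bound_improvement(3, 0.9))  # binom(3, 2) = 3, odd case
```

When $\mu$ is very close to 1, forming $1-\mu$ in floating point loses precision, so in practice it may be preferable to pass the failure probability directly; that variant is left out to keep the sketch aligned with the notation above.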