We present a theoretical foundation for the boundedness of the t-SNE algorithm. t-SNE runs gradient descent with the Kullback-Leibler (KL) divergence as its objective function, seeking a set of points that closely resemble the original data points in a high-dimensional space. Under a weak convergence assumption on the sampled dataset, we investigate t-SNE properties such as perplexity and affinity, and we examine the behavior of the points generated by t-SNE under continuous gradient flow. We show that these points remain bounded and use this result to establish the existence of a minimizer of the KL divergence.
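To make the objects in the abstract concrete, the following is a minimal NumPy sketch of the standard t-SNE objective and an explicit Euler discretization of its gradient flow. It is an illustration under stated assumptions, not the construction analyzed in the paper: the input affinities use a single fixed bandwidth `sigma` instead of a per-point bandwidth calibrated to a target perplexity, and the names `affinities`, `kl_and_grad`, and `run_flow`, as well as the step size, are hypothetical choices made for this example. The loop records the KL value and the largest point norm at each step, so the quantities the abstract discusses can be observed numerically.

```python
import numpy as np


def affinities(X, sigma=1.0):
    """Symmetric input affinities p_ij.

    Assumption: one fixed bandwidth `sigma` for every point. Standard t-SNE
    instead tunes a per-point bandwidth to match a target perplexity.
    """
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    logits = -d2 / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)  # exclude self-affinity
    cond = np.exp(logits - logits.max(axis=1, keepdims=True))
    cond /= cond.sum(axis=1, keepdims=True)  # rows hold p_{j|i}
    return (cond + cond.T) / (2.0 * n)  # symmetrized, sums to 1


def kl_and_grad(P, Y):
    """KL(P || Q) for the Student-t kernel and its gradient with respect to Y."""
    d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = 1.0 / (1.0 + d2)
    np.fill_diagonal(W, 0.0)
    Q = W / W.sum()
    mask = P > 0  # skips the zero diagonal
    kl = np.sum(P[mask] * np.log(P[mask] / Q[mask]))
    A = (P - Q) * W
    # grad_i = 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)
    grad = 4.0 * (A.sum(axis=1, keepdims=True) * Y - A @ Y)
    return kl, grad


def run_flow(X, dim=2, step=0.05, n_steps=200, seed=0):
    """Explicit Euler steps on dY/dt = -grad KL, with a small conservative step."""
    rng = np.random.default_rng(seed)
    P = affinities(X)
    Y = 1e-2 * rng.standard_normal((X.shape[0], dim))
    history = []  # (KL value, largest point norm) before each step
    for _ in range(n_steps):
        kl, grad = kl_and_grad(P, Y)
        history.append((kl, np.linalg.norm(Y, axis=1).max()))
        Y = Y - step * grad
    return Y, history


if __name__ == "__main__":
    X = np.random.default_rng(0).standard_normal((30, 10))
    Y, history = run_flow(X)
    print("KL at first and last step:", history[0][0], history[-1][0])
    print("largest point norm seen:", max(r for _, r in history))
```

The step size is deliberately small, so the embedding moves slowly. Practical implementations typically use much larger learning rates together with momentum and early exaggeration, none of which is modeled here.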