This paper presents a kernelized version of the t-SNE algorithm that maps high-dimensional data to a low-dimensional space while preserving the pairwise distances between data points as measured in a non-Euclidean metric. The kernel trick can be applied in the high-dimensional space only or in both spaces, the latter yielding an end-to-end kernelized version. The kernelized t-SNE algorithm offers new views of the relationships between data points, which can improve performance and accuracy in particular applications, such as classification problems involving kernel methods. The differences between t-SNE and its kernelized version are illustrated on several datasets, where the kernelized version shows a neater clustering of points belonging to different classes.
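As an illustration of applying the kernel trick in the high-dimensional space only, the following minimal sketch computes pairwise distances in the feature space induced by a kernel, using the identity d(i, j)^2 = K(i, i) - 2 K(i, j) + K(j, j). The choice of a Gaussian (RBF) kernel, the value of gamma, and the function names rbf_kernel and kernel_distances are assumptions made for this example; they are not taken from the paper, and the sketch is not the authors' implementation.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2).

    The kernel and the default gamma are illustrative assumptions.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] - 2.0 * (X @ X.T) + sq_norms[None, :]
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    return np.exp(-gamma * sq_dists)


def kernel_distances(K):
    """Pairwise distances in the feature space induced by kernel matrix K.

    Uses d(i, j)^2 = K[i, i] - 2 * K[i, j] + K[j, j].
    """
    diag = np.diag(K)
    sq = diag[:, None] - 2.0 * K + diag[None, :]
    np.maximum(sq, 0.0, out=sq)  # guard against round-off before the square root
    np.fill_diagonal(sq, 0.0)    # a point is at distance zero from itself
    return np.sqrt(sq)


# Small synthetic example: 5 points in 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
D = kernel_distances(rbf_kernel(X, gamma=0.5))
```

A matrix such as D could then replace the Euclidean distances when building the high-dimensional affinities, for instance by passing it to a t-SNE implementation that accepts precomputed distances. Kernelizing the low-dimensional space as well, as in the end-to-end variant, would require changes to the embedding step that this sketch does not cover.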