Visualizations of high-dimensional datasets commonly rely on dimension reduction techniques that provide a single two-dimensional view of the data. We describe ENS-t-SNE, an algorithm for Embedding Neighborhoods Simultaneously that generalizes t-distributed Stochastic Neighbor Embedding (t-SNE). By viewing ENS-t-SNE's 3D embedding from different viewpoints, one can visualize different types of clusters within the same high-dimensional dataset. This lets the viewer see and keep track of the different types of clusters, which is harder with multiple separate 2D embeddings, where corresponding points cannot be easily identified. We illustrate the utility of ENS-t-SNE with real-world applications and provide an extensive quantitative evaluation on datasets of different types and sizes.
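The idea of reading different cluster structures off one 3D embedding by changing the viewpoint can be illustrated with a small sketch. This is not the ENS-t-SNE optimization itself: it assumes a 3D embedding is already available, assumes each view is an orthographic projection onto the plane perpendicular to a viewing direction, and uses hypothetical helper names (`view_basis`, `project_view`) and toy data invented for illustration.

```python
import numpy as np

def view_basis(direction):
    """Return a 2x3 orthonormal basis of the plane perpendicular to `direction`."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    # Pick a helper axis that is not (nearly) parallel to the viewing direction.
    helper = np.array([0.0, 0.0, 1.0]) if abs(d[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
    u = np.cross(helper, d)
    u /= np.linalg.norm(u)
    v = np.cross(d, u)  # unit length, since d and u are orthonormal
    return np.stack([u, v])

def project_view(points3d, direction):
    """Orthographically project (n, 3) points onto the plane seen along `direction`."""
    return np.asarray(points3d, dtype=float) @ view_basis(direction).T

# Toy 3D embedding (an assumption, not output of ENS-t-SNE):
# the sign of x encodes one grouping, the sign of y encodes another.
rng = np.random.default_rng(0)
signs = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
centers = np.hstack([signs, np.zeros((4, 1))])
points = np.repeat(centers, 25, axis=0) + 0.1 * rng.standard_normal((100, 3))

view_along_x = project_view(points, [1, 0, 0])  # separates points by the sign of y
view_along_y = project_view(points, [0, 1, 0])  # separates points by the sign of x

# Row i in both views is the same data point, so correspondence is kept.
print(view_along_x.shape, view_along_y.shape)
```

Because both views are projections of the same array, a point keeps its row index across views, which is the correspondence that separate 2D embeddings do not provide directly.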