Dimensionality reduction (DR) plays a vital role in the visual analysis of high-dimensional data. One main aim of DR is to reveal hidden patterns that lie on intrinsic low-dimensional manifolds. However, DR often overlooks important patterns when the manifolds are distorted or masked by certain influential data attributes. This paper presents a feature learning framework, FEALM, designed to generate a set of optimized data projections for nonlinear DR so that important patterns in the hidden manifolds can be captured. These projections produce maximally different nearest-neighbor graphs, so that the resulting DR outcomes differ significantly from one another. To achieve this capability, we design an optimization algorithm and introduce a new graph dissimilarity measure, named neighbor-shape dissimilarity. Additionally, we develop interactive visualizations to assist in comparing the obtained DR results and interpreting each of them. We demonstrate FEALM's effectiveness through experiments and case studies using synthetic and real-world datasets.
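As a rough illustration of the core idea (searching for a data projection whose nearest-neighbor graph differs as much as possible from the original one), the following Python sketch may help. It is not FEALM itself: the mean Jaccard distance between neighbor sets is a simple stand-in for neighbor-shape dissimilarity, random search over attribute weights is a stand-in for the optimization algorithm, and all function names and parameters are hypothetical.

```python
import numpy as np


def knn_sets(X, k):
    """Return, for each row of X, the set of indices of its k nearest neighbors."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)          # pairwise squared Euclidean distances
    np.fill_diagonal(d2, np.inf)           # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]    # k closest points per row
    return [set(row.tolist()) for row in idx]


def neighbor_jaccard_dissimilarity(X_a, X_b, k=5):
    """Mean Jaccard distance between corresponding k-NN sets.

    Assumption: a stand-in measure, NOT the paper's neighbor-shape dissimilarity.
    """
    sets_a = knn_sets(X_a, k)
    sets_b = knn_sets(X_b, k)
    dists = [1.0 - len(a & b) / len(a | b) for a, b in zip(sets_a, sets_b)]
    return float(np.mean(dists))


def weighted_projection(X, w):
    """Scale each attribute by a weight, i.e. the linear projection X @ diag(w)."""
    return X * np.asarray(w)[None, :]


def most_different_weights(X, k=5, n_candidates=50, seed=0):
    """Random search (assumed stand-in for the paper's optimizer) for the
    attribute weights whose k-NN graph differs most from that of X."""
    rng = np.random.default_rng(seed)
    best_w, best_d = np.ones(X.shape[1]), 0.0
    for _ in range(n_candidates):
        w = rng.uniform(0.0, 1.0, size=X.shape[1])
        d = neighbor_jaccard_dissimilarity(X, weighted_projection(X, w), k)
        if d > best_d:
            best_w, best_d = w, d
    return best_w, best_d


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.normal(size=(100, 4))
    X[:, 0] *= 10.0                        # one influential attribute dominates distances
    w, d = most_different_weights(X)
    print("weights:", np.round(w, 2), "dissimilarity:", round(d, 3))
```

The projected data `weighted_projection(X, w)` could then be passed to any nonlinear DR method to obtain an embedding that differs from the one computed on the unweighted data.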