We present MAPLE, a new nonlinear dimensionality reduction method that enhances UMAP by improving manifold modeling. MAPLE employs a self-supervised learning approach to encode low-dimensional manifold geometry more efficiently. Central to this approach are maximum manifold capacity representations (MMCRs), which help untangle complex manifolds by compressing variance among locally similar data points while amplifying variance among dissimilar data points. This design is particularly effective for high-dimensional data with substantial intra-cluster variance and curved manifold structures, such as biological or image data. Our qualitative and quantitative evaluations demonstrate that MAPLE can produce clearer visual cluster separations and finer subcluster resolution than UMAP while maintaining comparable computational cost.
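To make the MMCR idea concrete, the following is a minimal sketch of an MMCR-style objective as it is commonly formulated in the self-supervised learning literature: embeddings are normalized to the unit sphere, each group of similar points is averaged into a centroid, and the loss is the negative nuclear norm of the centroid matrix. This is an illustration under stated assumptions, not MAPLE's actual loss. The function name `mmcr_style_loss`, the `(n_points, n_views, dim)` input layout, and the use of a point's local neighbors as its "views" are all hypothetical choices made for this example.

```python
import numpy as np


def mmcr_style_loss(views):
    """Negative nuclear norm of the centroid matrix (hypothetical helper).

    views: array of shape (n_points, n_views, dim), where views[i] holds the
    embeddings treated as "locally similar" to point i (an assumption here).
    """
    # Project every embedding onto the unit sphere.
    z = views / np.linalg.norm(views, axis=-1, keepdims=True)
    # Average each group of similar embeddings: shape (n_points, dim).
    centroids = z.mean(axis=1)
    # Nuclear norm = sum of singular values of the centroid matrix.
    nuclear_norm = np.linalg.svd(centroids, compute_uv=False).sum()
    # Minimizing the negative nuclear norm rewards tight groups (long
    # centroids) whose centroids spread across many directions.
    return -nuclear_norm


# Three tight groups along orthogonal axes versus three collapsed groups.
eye = np.eye(3)
tight = np.stack([eye, eye], axis=1)       # shape (3, 2, 3)
collapsed = np.tile(eye[0], (3, 2, 1))     # shape (3, 2, 3), all identical
print(mmcr_style_loss(tight), mmcr_style_loss(collapsed))
```

In this toy case the well-separated configuration receives a lower (better) loss than the collapsed one, which mirrors the stated goal of compressing variance among similar points while amplifying variance among dissimilar points.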