Classical nonlinear dimensionality reduction (NLDR) techniques like t-SNE, Isomap, and LLE excel at creating low-dimensional embeddings for data visualization but fundamentally lack the ability to map these embeddings back to the original high-dimensional space. This one-way transformation limits their use in generative applications. This paper addresses this critical gap by introducing a systematic framework for constructing neural decoder architectures for prominent NLDR methods, enabling bidirectional mapping for the first time. We extend this framework by implementing a diffusion-based generative process that operates directly within these learned manifold spaces. Through experiments on the CelebA dataset, we evaluate the reconstruction and generative performance of our approach against autoencoder and standard diffusion model baselines. Our findings reveal a fundamental trade-off: while the decoders successfully reconstruct data, their quality is surpassed by end-to-end optimized autoencoders. Moreover, manifold-constrained diffusion yields poor-quality samples, suggesting that the discrete and sparse nature of classical NLDR embeddings is ill-suited for the continuous interpolation required by generative models. This work highlights the inherent challenges in retrofitting generative capabilities onto NLDR methods designed primarily for visualization and analysis.