With the increasing availability of high-dimensional data, analysts often rely on exploratory data analysis to understand complex datasets. A key approach to exploring such data is dimensionality reduction, which embeds high-dimensional data in two dimensions to enable visual exploration. However, popular embedding techniques, such as t-SNE and UMAP, typically assume that data points are independent. When this assumption is violated, as in time-series data, the resulting visualizations may fail to reveal important temporal patterns and trends. To address this, we propose a formal extension to existing dimensionality reduction methods that adds two temporal loss terms, which explicitly highlight temporal progression in the embedded visualizations. Through a series of experiments on both synthetic and real-world datasets, we demonstrate that our approach uncovers temporal patterns and improves the interpretability of the visualizations. Furthermore, the method improves temporal coherence while preserving the fidelity of the embeddings, making it a robust tool for analyzing dynamic data.
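The abstract does not define the two temporal loss terms, so the following is only an illustrative sketch of the general idea of adding temporal penalties to an embedding objective. It is not the formulation proposed in the paper. The function names, the choice of a step penalty and a curvature penalty, and the weights `lam_step` and `lam_curve` are all assumptions made for this example.

```python
import numpy as np

def temporal_step_penalty(Y):
    """Hypothetical term: mean squared distance between consecutive
    embedded points, where the rows of Y are ordered by time."""
    steps = np.diff(Y, axis=0)
    return float(np.mean(np.sum(steps ** 2, axis=1)))

def temporal_curvature_penalty(Y):
    """Hypothetical term: mean squared second difference along the path.
    It is zero for a straight, evenly spaced trajectory."""
    accel = np.diff(Y, n=2, axis=0)
    return float(np.mean(np.sum(accel ** 2, axis=1)))

def combined_loss(base_loss, Y, lam_step=0.1, lam_curve=0.1):
    """Add the two assumed temporal penalties to an existing embedding
    loss value (for example, one computed by a t-SNE or UMAP objective)."""
    return (base_loss
            + lam_step * temporal_step_penalty(Y)
            + lam_curve * temporal_curvature_penalty(Y))

# Toy comparison: a 2-D path that advances steadily in time versus the
# same points visited in a random temporal order.
t = np.linspace(0.0, 1.0, 20)
smooth = np.column_stack([t, 2.0 * t])
rng = np.random.default_rng(0)
shuffled = smooth[rng.permutation(len(smooth))]

print(combined_loss(1.0, smooth) < combined_loss(1.0, shuffled))
```

In this toy setup the temporally ordered layout receives the lower combined loss, which is the behaviour a temporal term is meant to encourage. In practice the weights would need tuning so that the temporal terms do not overwhelm the base objective and degrade embedding fidelity.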