We present S+t-SNE, an adaptation of the t-SNE algorithm designed to handle infinite data streams. The core idea behind S+t-SNE is to update the t-SNE embedding incrementally as new data arrives, so that the method adapts to streaming scenarios. By selecting the most important points at each step, the algorithm remains scalable while keeping its visualisations informative. A blind method for drift management adjusts the embedding space, enabling continuous visualisation of evolving data dynamics. Our experimental evaluations demonstrate the effectiveness and efficiency of S+t-SNE and highlight its ability to capture patterns in a streaming scenario. We hope our approach offers researchers and practitioners a real-time tool for understanding and interpreting high-dimensional data.
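The three ingredients named in the abstract (incremental updates, selection of important points, and blind drift handling) can be illustrated with a minimal Python sketch. This is not the S+t-SNE implementation: the class name `StreamingEmbeddingSketch`, the parameters `max_points` and `decay`, the use of recency weights as a stand-in for "importance", and the nearest-neighbour placement of new points are all assumptions made for illustration. A real system would refine the 2-D positions with t-SNE gradient steps, which this sketch omits.

```python
import numpy as np


class StreamingEmbeddingSketch:
    """Hypothetical illustration only; not the authors' S+t-SNE code."""

    def __init__(self, max_points=200, decay=0.9, seed=0):
        self.max_points = max_points  # assumed budget of retained points
        self.decay = decay            # assumed blind forgetting factor
        self.rng = np.random.default_rng(seed)
        self.X = None  # retained high-dimensional points
        self.Y = None  # their 2-D embedding coordinates
        self.w = None  # importance weights (here: recency only)

    def partial_fit(self, batch):
        batch = np.asarray(batch, dtype=float)
        if self.X is None:
            # First batch: a random 2-D layout stands in for an initial t-SNE run.
            Y_new = self.rng.normal(scale=1e-2, size=(len(batch), 2))
            self.X, self.Y, self.w = batch, Y_new, np.ones(len(batch))
        else:
            # Blind drift handling: old points lose weight on every batch,
            # without any explicit drift detection.
            self.w = self.w * self.decay
            # Incremental update: place each new point next to the embedding
            # of its nearest retained neighbour in the original space.
            dists = np.linalg.norm(
                batch[:, None, :] - self.X[None, :, :], axis=2
            )
            nearest = dists.argmin(axis=1)
            jitter = self.rng.normal(scale=1e-2, size=(len(batch), 2))
            Y_new = self.Y[nearest] + jitter
            self.X = np.vstack([self.X, batch])
            self.Y = np.vstack([self.Y, Y_new])
            self.w = np.concatenate([self.w, np.ones(len(batch))])
        # Point selection: keep only the highest-weight points so that
        # memory stays bounded however long the stream runs.
        if len(self.X) > self.max_points:
            keep = np.sort(np.argsort(self.w)[-self.max_points:])
            self.X, self.Y, self.w = self.X[keep], self.Y[keep], self.w[keep]
        return self.Y


# Usage: feed batches from a synthetic stream and read back the 2-D layout.
stream_rng = np.random.default_rng(1)
model = StreamingEmbeddingSketch(max_points=50, decay=0.5)
for _ in range(5):
    layout = model.partial_fit(stream_rng.normal(size=(30, 10)))
```

The design choice worth noting is that the retained set, not the full stream, drives both memory use and the cost of placing new points, which is what makes an unbounded stream tractable in this sketch.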