Dimensionality reduction algorithms are often used to visualise high-dimensional data. Previous studies have used prior information to enhance or suppress expected patterns in projections. In this paper, we adapt such techniques for interactive exploration guided by domain knowledge. Inspired by Mapper and STAD, we present three types of lens functions for UMAP, a state-of-the-art dimensionality reduction algorithm. Lens functions enable analysts to adapt projections to their questions, revealing otherwise hidden patterns. They filter the modelled connectivity to explore the interaction between manually selected features and the data's structure, creating configurable perspectives, each potentially revealing new insights. We demonstrate the effectiveness of the lens functions in two use cases and analyse their computational cost in a synthetic benchmark. Our implementation is available in an open-source Python package: https://github.com/vda-lab/lensed_umap.
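To make the idea of filtering the modelled connectivity concrete, the following is a minimal sketch of one possible Mapper-style filter over a k-nearest-neighbour graph. It is not the lensed_umap API and does not claim to reproduce any of the three lens types presented in the paper. The function names (`knn_edges`, `lens_bins`, `apply_lens`), the equal-width binning, and the rule of keeping only edges between the same or adjacent bins are all assumptions made for illustration. It uses only the Python standard library.

```python
from math import dist


def knn_edges(points, k):
    """Return undirected k-nearest-neighbour edges as a set of (i, j) pairs with i < j."""
    edges = set()
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p and keep the k closest.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        for j in neighbours:
            edges.add((min(i, j), max(i, j)))
    return edges


def lens_bins(lens, n_bins):
    """Assign each lens value to one of n_bins equal-width bins (hypothetical binning rule)."""
    lo, hi = min(lens), max(lens)
    # Fall back to a unit width when all lens values are identical.
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in lens]


def apply_lens(edges, lens, n_bins):
    """Keep only edges whose endpoints fall in the same or adjacent lens bins."""
    bins = lens_bins(lens, n_bins)
    return {(i, j) for i, j in edges if abs(bins[i] - bins[j]) <= 1}


# Four points on a unit square; the lens feature separates left from right.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
lens = [0, 0, 10, 10]

edges = knn_edges(points, k=2)
print(sorted(apply_lens(edges, lens, n_bins=3)))  # [(0, 1), (2, 3)]
```

In this toy example the unfiltered graph connects all four corners of the square, while the filtered graph keeps only the edges between points with similar lens values. A layout computed from the filtered graph would therefore separate the two groups, which is the kind of configurable perspective the abstract describes.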