Dimensionality reduction (DR) techniques help analysts understand patterns in high-dimensional spaces. These techniques, whose results are often shown as scatter plots, are employed in diverse scientific domains and facilitate similarity analysis among clusters and data samples. For datasets containing many granularities, or when analysis follows the information visualization mantra, hierarchical DR techniques are the most suitable approach, since they present major structures first and details on demand. This work presents HUMAP, a novel hierarchical dimensionality reduction technique designed to be flexible in preserving local and global structures and to preserve the mental map throughout hierarchical exploration. We provide empirical evidence of our technique's superiority over current hierarchical approaches and present a case study applying HUMAP to dataset labelling.