Visual quality measures (VQMs) are designed to support analysts by automatically detecting and quantifying patterns in visualizations. We propose a new VQM for visual grouping patterns in scatterplots, called ClustML, which is trained on previously collected human subject judgments. Our model encodes scatterplots in the parametric space of a Gaussian Mixture Model and uses a classifier trained on human judgment data to estimate the perceptual complexity of grouping patterns. The model also yields the numbers of initial mixture components and final combined groups. ClustML improves on existing VQMs, first, by better estimating human judgments on two-Gaussian cluster patterns and, second, by giving higher accuracy when ranking general cluster patterns in scatterplots. We use it to analyze kinship data for genome-wide association studies, in which experts rely on the visual analysis of large sets of scatterplots. We make the benchmark datasets and the new VQM available for practical use and further improvements.
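To make the relationship between the initial mixture components and the final combined groups concrete, the following is a minimal sketch rather than the ClustML implementation. It assumes the Gaussian Mixture Model has already been fitted, so each component is given by its weight, mean, and covariance. The names `Component`, `perceived_as_one`, and `count_groups` are hypothetical. The trained classifier is replaced by a placeholder rule, a threshold on the Bhattacharyya distance between two components, and combining components transitively is also an assumption made for illustration.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Component:
    """One 2D Gaussian mixture component (hypothetical container)."""
    weight: float
    mean: tuple  # (mx, my)
    cov: tuple   # ((sxx, sxy), (sxy, syy)), symmetric positive definite


def _det(cov):
    # Determinant of a symmetric 2x2 covariance matrix.
    return cov[0][0] * cov[1][1] - cov[0][1] * cov[0][1]


def bhattacharyya(a, b):
    """Bhattacharyya distance between two 2D Gaussians."""
    # Average of the two covariance matrices.
    sxx = (a.cov[0][0] + b.cov[0][0]) / 2
    sxy = (a.cov[0][1] + b.cov[0][1]) / 2
    syy = (a.cov[1][1] + b.cov[1][1]) / 2
    det = sxx * syy - sxy * sxy
    dx = a.mean[0] - b.mean[0]
    dy = a.mean[1] - b.mean[1]
    # d^T S^{-1} d written out for the 2x2 case.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return maha / 8 + 0.5 * math.log(det / math.sqrt(_det(a.cov) * _det(b.cov)))


def perceived_as_one(a, b, threshold=1.0):
    """PLACEHOLDER for a classifier trained on human judgments.

    The threshold value is arbitrary and chosen only for this example.
    """
    return bhattacharyya(a, b) < threshold


def count_groups(components, same_group=perceived_as_one):
    """Return (number of initial components, number of combined groups)."""
    parent = list(range(len(components)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Combine every pair the decision rule judges to form a single group.
    for i, j in combinations(range(len(components)), 2):
        if same_group(components[i], components[j]):
            parent[find(i)] = find(j)

    groups = {find(i) for i in range(len(components))}
    return len(components), len(groups)


identity = ((1.0, 0.0), (0.0, 1.0))
components = [
    Component(0.4, (0.0, 0.0), identity),
    Component(0.4, (0.5, 0.0), identity),    # heavily overlaps the first
    Component(0.2, (10.0, 10.0), identity),  # far away from both
]
print(count_groups(components))  # (3, 2)
```

In this toy input the two overlapping components are combined and the distant one stays separate, so three initial components reduce to two groups. Swapping `same_group` for a model trained on human judgment data would be the natural extension point.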