Visualizing data is often a crucial first step in data analytics workflows, but growing data sizes pose challenges due to computational and visual perception limitations. As a result, data analysts commonly down-sample their data and work with subsets. Deriving representative samples, however, remains a challenge. This paper focuses on scatterplots, a widely used visualization type, and introduces a novel sampling objective -- perception-awareness -- that aims to improve sample efficacy by targeting how humans perceive a visualization. We make the following contributions: (1) We propose perception-augmented databases and design PAwS: a novel perception-aware sampling method for scatterplots that leverages saliency maps -- a computer vision tool for predicting areas of attention focus in visualizations -- and models perception-awareness via saliency, density, and coverage objectives. (2) We design ApproPAwS: a fast, perception-aware method for approximate visualizations, which exploits the fact that small visual perturbations are often imperceptible to humans. (3) We introduce the concept of perceptual similarity as a metric for sample quality, and present a novel method that compares saliency maps to measure it. (4) Our extensive experimental evaluation shows that our methods consistently outperform prior art in producing samples with high perceptual similarity, while ApproPAwS achieves up to 100x speedups with minimal loss in visual fidelity. Our user study shows that PAwS is often preferred by humans, validating our quantitative findings.
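To make the idea of perceptual similarity in contribution (3) concrete, the following is a minimal sketch of one way two saliency maps could be compared. It is not the authors' method: the abstract does not say which comparison is used, so Pearson correlation, a common score for comparing saliency maps, is swapped in as an assumption. The function name `saliency_correlation` and the toy 3x3 maps are hypothetical, and the saliency model that would produce real maps is not shown.

```python
import math
from typing import Sequence

Map = Sequence[Sequence[float]]


def saliency_correlation(map_a: Map, map_b: Map) -> float:
    """Pearson correlation between two saliency maps given as 2D grids.

    Assumption: Pearson correlation is a stand-in metric; the abstract
    does not specify how the saliency maps are actually compared.
    """
    # Flatten both grids row by row so cells are compared position-wise.
    a = [v for row in map_a for v in row]
    b = [v for row in map_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("maps must be non-empty and have the same number of cells")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant map has no spatial pattern to correlate
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy values: `full` stands for the saliency map of the full
# scatterplot, the other two for saliency maps of candidate samples.
full = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.3], [0.1, 0.3, 0.1]]
sample_good = [[0.0, 0.2, 0.1], [0.3, 0.8, 0.3], [0.1, 0.2, 0.0]]
sample_poor = [[0.9, 0.1, 0.1], [0.1, 0.1, 0.1], [0.1, 0.1, 0.8]]

score_good = saliency_correlation(full, sample_good)
score_poor = saliency_correlation(full, sample_poor)
```

Under this stand-in metric, a sample whose saliency map keeps attention on the same regions as the full plot scores close to 1, while a sample that shifts attention elsewhere scores lower or negative.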