Real-world event sequences are often complex and heterogeneous, making it difficult to create meaningful visualizations using simple data aggregation and visual encoding techniques. Consequently, visualization researchers have developed numerous visual summarization techniques to generate concise overviews of sequential data. These techniques vary widely in summary structure and content, and their effectiveness is not yet well understood. In this work, we present the design and results of an insight-based crowdsourcing experiment evaluating three existing visual summarization techniques: CoreFlow, SentenTree, and Sequence Synopsis. We compare the visual summaries generated by these techniques across three tasks, on six datasets, at six levels of granularity. We analyze the effects of these variables on participant-rated summary quality and on task completion time. Our analysis shows that Sequence Synopsis produces the highest-quality visual summaries for all three tasks, but its results also take the longest to understand. We also find that participants evaluate visual summary quality based on two aspects: content and interpretability. We discuss the implications of our findings for developing and evaluating new visual summarization techniques.