Automated visualization recommendation facilitates the rapid creation of effective visualizations, which is especially beneficial for users with limited time and limited knowledge of data visualization. There is an increasing trend toward leveraging machine learning (ML) techniques for end-to-end visualization recommendation. However, existing ML-based approaches implicitly assume that there is only one appropriate visualization for a given dataset, which often does not hold in real applications. They also often work like a black box, making it difficult for users to understand why specific visualizations are recommended. To fill this research gap, we propose AdaVis, an adaptive and explainable approach that recommends one or multiple appropriate visualizations for a tabular dataset. It leverages a box embedding-based knowledge graph to model the possible one-to-many mapping relations among different entities (i.e., data features, dataset columns, datasets, and visualization choices). The embeddings of the entities and relations are learned from dataset-visualization pairs. AdaVis also incorporates an attention mechanism into the inference framework. The attention weights indicate the relative importance of data features for a dataset and provide fine-grained explainability. Extensive evaluations through quantitative metrics, case studies, and user interviews demonstrate the effectiveness of AdaVis.
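
To make the two ideas in the abstract concrete, the following is a minimal sketch of how a box embedding can express a one-to-many mapping and how attention weights can expose feature importance. It is not the AdaVis model. It assumes a generic query-box arrangement in which a dataset is represented as an axis-aligned box and each visualization choice as a point, so every choice falling inside the box is recommended. All names (`dataset_box`, `recommend`), the 2-D embeddings, and the relevance scores are hypothetical, hand-picked values rather than learned ones.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dataset_box(feature_centers, feature_offsets, scores):
    """Combine per-feature boxes into one dataset box (illustrative only).

    The box center is an attention-weighted average of the feature centers;
    the half-width is the element-wise minimum of the feature half-widths.
    Returns the attention weights, the min corner, and the max corner.
    """
    weights = softmax(scores)
    center = weights @ feature_centers
    offset = feature_offsets.min(axis=0)
    return weights, center - offset, center + offset

def recommend(box_min, box_max, choices):
    """Return every visualization choice whose point lies inside the box."""
    return [
        name
        for name, point in choices.items()
        if np.all(point >= box_min) and np.all(point <= box_max)
    ]

# Hypothetical embeddings of three data features of one dataset.
feature_centers = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5]])
feature_offsets = np.array([[1.0, 1.0], [0.8, 0.9], [1.2, 0.7]])
# Hypothetical relevance scores; in a trained model these would be learned.
scores = np.array([1.0, 2.0, 1.5])

# Hypothetical point embeddings of visualization choices.
choices = {
    "bar": np.array([1.5, 1.5]),
    "line": np.array([2.0, 1.0]),
    "scatter": np.array([4.0, 4.0]),
}

weights, box_min, box_max = dataset_box(feature_centers, feature_offsets, scores)
recs = recommend(box_min, box_max, choices)

print("attention weights:", np.round(weights, 3))
print("recommended:", recs)
```

With these toy values the dataset box contains both the `bar` and `line` points, so two visualizations are recommended for a single dataset, and the largest attention weight falls on the second feature, which is the kind of signal that could be surfaced to a user as an explanation.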