The growing importance of data visualization in business intelligence and data science underscores the need for tools that can efficiently generate meaningful visualizations from large datasets. Existing tools fall into two main categories: human-powered tools (e.g., Tableau and PowerBI), which require intensive expert involvement, and AI-powered automated tools (e.g., Draco and Table2Charts), which often fail to anticipate specific user needs. In this paper, we aim to achieve the best of both worlds. Our key idea is to first auto-generate a set of high-quality visualizations to minimize manual effort, and then iteratively refine the results using user feedback so that they align more closely with the user's needs. To this end, we present HAIChart, a reinforcement learning-based framework that iteratively recommends good visualizations for a given dataset by incorporating user feedback. Specifically, we propose a Monte Carlo Graph Search-based visualization generation algorithm paired with a composite reward function to efficiently explore the visualization space and automatically generate good visualizations. We devise a visualization hints mechanism to actively incorporate user feedback, thus progressively refining the visualization generation module. We further prove that the top-k visualization hints selection problem is NP-hard and design an efficient algorithm for it. We conduct both quantitative evaluations and user studies, showing that HAIChart significantly outperforms state-of-the-art human-powered tools (21% higher Recall and 1.8 times faster) and AI-powered automated tools (25.1% and 14.9% better in terms of Hit@3 and R10@30, respectively).
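To make the Hit@3 figure above concrete, the following is a minimal sketch of how a Hit@k score is commonly computed for a ranked recommendation list. It assumes the usual definition (a test case counts as a hit if at least one ground-truth item appears among the top k recommendations); the abstract does not spell out the definition used, so this may differ in detail. The function names `hit_at_k` and `mean_hit_at_k` and the chart identifiers are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def hit_at_k(recommended: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Return 1.0 if any of the top-k recommended items is relevant, else 0.0.

    Assumed definition of Hit@k; not taken from the paper itself.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    return 1.0 if any(item in relevant for item in top_k) else 0.0


def mean_hit_at_k(
    cases: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]], k: int
) -> float:
    """Average Hit@k over (ranked recommendations, ground-truth set) pairs."""
    scores = [hit_at_k(recommended, relevant, k) for recommended, relevant in cases]
    if not scores:
        raise ValueError("at least one test case is required")
    return sum(scores) / len(scores)


# Hypothetical ranked chart identifiers for two test cases.
ranking = ["bar_a", "line_b", "pie_c", "scatter_d"]
cases = [
    (ranking, {"pie_c"}),      # ground truth is ranked 3rd: a hit at k=3
    (ranking, {"scatter_d"}),  # ground truth is ranked 4th: a miss at k=3
]
score = mean_hit_at_k(cases, k=3)
```

Under this definition, one of the two hypothetical cases is a hit at k=3, so `score` is 0.5. A relative improvement such as the 25.1% reported above would then be a comparison between two such averages computed on the same test cases.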