Natural language interfaces (NLIs) for chart creation are becoming increasingly popular because natural language interaction is intuitive. A key challenge in this approach is to accurately capture user intents and transform them into proper chart specifications. This challenge hinders the wide adoption of NLIs for chart generation, because users' natural language inputs are generally abstract (i.e., ambiguous or under-specified) and lack a clear specification of visual encodings. Recently, pre-trained large language models (LLMs) have exhibited superior performance in understanding and generating natural language, demonstrating great potential for downstream tasks. Inspired by this trend, we propose ChartGPT, which generates charts from abstract natural language inputs. However, LLMs struggle with complex logic problems. To enable the model to accurately specify the complex parameters and perform the operations involved in chart generation, we decompose the generation process into a step-by-step reasoning pipeline, so that the model only needs to reason about a single, specific sub-task during each run. Moreover, LLMs are pre-trained on general datasets, which might be biased with respect to the task of chart generation. To provide adequate visualization knowledge, we create a dataset consisting of abstract utterances and charts and improve model performance through fine-tuning. We further design an interactive interface for ChartGPT that allows users to check and modify the intermediate outputs of each step. We evaluate the effectiveness of the proposed system through quantitative evaluations and a user study.
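To make the idea of a step-by-step reasoning pipeline with editable intermediate outputs concrete, the following is a minimal sketch and not the ChartGPT implementation. The abstract does not enumerate the sub-tasks, so the step names in `STEPS`, the prompt layout, and the helper names (`PipelineState`, `build_prompt`, `run_pipeline`, `stub_model`) are all illustrative assumptions. The model is any callable from a prompt string to a completion string; a canned stub stands in for a fine-tuned LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A "model" is any callable mapping a prompt string to a completion string.
# In practice this would wrap a fine-tuned LLM; here it is left abstract.
Model = Callable[[str], str]

# Hypothetical sub-task names: the source does not list the actual steps.
STEPS: List[str] = ["select_columns", "choose_chart_type", "assign_encodings"]


@dataclass
class PipelineState:
    """Carries the user's utterance, the table schema, and per-step outputs."""
    utterance: str
    columns: List[str]
    outputs: Dict[str, str] = field(default_factory=dict)


def build_prompt(step: str, state: PipelineState) -> str:
    """Build a prompt for one sub-task, including all earlier step outputs."""
    lines = [
        f"Utterance: {state.utterance}",
        f"Table columns: {', '.join(state.columns)}",
    ]
    for name, value in state.outputs.items():
        lines.append(f"{name}: {value}")
    lines.append(f"Task: {step}")
    return "\n".join(lines)


def run_pipeline(
    model: Model,
    state: PipelineState,
    overrides: Optional[Dict[str, str]] = None,
) -> PipelineState:
    """Run each sub-task in order, one model call per step.

    `overrides` mimics a user editing an intermediate output: an overridden
    step skips the model call, and later steps see the edited value.
    """
    overrides = overrides or {}
    for step in STEPS:
        if step in overrides:
            state.outputs[step] = overrides[step]
        else:
            state.outputs[step] = model(build_prompt(step, state)).strip()
    return state


def stub_model(prompt: str) -> str:
    """Canned responses keyed by the task name on the prompt's last line."""
    task = prompt.rsplit("Task: ", 1)[1]
    canned = {
        "select_columns": "date, sales",
        "choose_chart_type": "line",
        "assign_encodings": "x=date, y=sales",
    }
    return canned[task]


if __name__ == "__main__":
    state = PipelineState("show me how sales changed", ["date", "sales", "region"])
    result = run_pipeline(stub_model, state, overrides={"choose_chart_type": "bar"})
    for name, value in result.outputs.items():
        print(f"{name}: {value}")
```

Because each call sees only one task plus the outputs of earlier steps, a user correction at any step propagates to every later step without rerunning the earlier ones.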