The field of data visualisation has long aimed to devise solutions for generating visualisations directly from natural language text. Research in Natural Language Interfaces (NLIs) has contributed towards the development of such techniques. However, implementing workable NLIs has always been challenging, both because natural language is inherently ambiguous and because unclear or poorly written user queries make it difficult for existing language models to discern user intent. Instead of pursuing the usual path of developing new iterations of language models, this study proposes leveraging advances in pre-trained large language models (LLMs) such as ChatGPT and GPT-3 to convert free-form natural language directly into code for appropriate visualisations. This paper presents a novel system, Chat2VIS, which takes advantage of the capabilities of LLMs and demonstrates how, with effective prompt engineering, the complex problem of language understanding can be solved more efficiently, resulting in simpler and more accurate end-to-end solutions than prior approaches. Chat2VIS shows that LLMs, together with the proposed prompts, offer a reliable approach to rendering visualisations from natural language queries, even when queries are highly misspecified and underspecified. This solution also significantly reduces the cost of developing NLI systems, while attaining greater visualisation inference abilities than traditional NLP approaches that use hand-crafted grammar rules and tailored models. This study also shows how LLM prompts can be constructed in a way that preserves data security and privacy while remaining generalisable to different datasets. This work compares the performance of GPT-3, Codex and ChatGPT across a number of case studies and contrasts their performance with that reported in prior studies.
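To make the idea of a privacy-preserving, dataset-agnostic prompt more concrete, the following is a minimal sketch of one possible construction. It is an assumption for illustration and not the prompt used by Chat2VIS: the function name `build_prompt`, the prompt wording, and the choice of pandas and matplotlib as the target libraries are all hypothetical. The sketch passes only column names and types to the model, never row values, and takes the schema as a parameter so the same template could be reused across datasets.

```python
from typing import Mapping


def build_prompt(table_name: str, columns: Mapping[str, str], query: str) -> str:
    """Build a code-completion prompt from schema metadata only.

    Hypothetical helper: only column names and types are included, so no
    row values from the dataset ever appear in the text sent to the model.
    """
    schema_lines = [f"- '{name}' (type: {dtype})" for name, dtype in columns.items()]
    description = "\n".join([
        '"""',
        f"A pandas DataFrame named {table_name} has the following columns:",
        *schema_lines,
        f"Write Python code using matplotlib to plot: {query}",
        '"""',
    ])
    # A short code prefix nudges the model to continue with plotting code.
    code_prefix = "import pandas as pd\nimport matplotlib.pyplot as plt\n"
    return description + "\n" + code_prefix


# Example with an invented schema; the resulting string would be sent to an LLM.
prompt = build_prompt(
    "df",
    {"title": "object", "genre": "object", "gross": "float64"},
    "total gross by genre",
)
print(prompt)
```

Because the schema is supplied as an argument, switching to a different dataset in this sketch only requires a different mapping of column names to types; the template itself is unchanged.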