Automatically generating data visualizations in response to human utterances about datasets requires a deep semantic understanding of the utterance, including implicit and explicit references to data attributes, visualization tasks, and necessary data preparation steps. Natural Language Interfaces (NLIs) for data visualization have explored ways to infer such information, yet challenges persist due to inherent uncertainty in human speech. Recent advances in Large Language Models (LLMs) provide an avenue to address these challenges, but their ability to extract the relevant semantic information remains underexplored. In this study, we evaluate four publicly available LLMs (GPT-4, Gemini-Pro, Llama3, and Mixtral), investigating their ability to comprehend utterances even in the presence of uncertainty and to identify the relevant data context and visualization tasks. Our findings reveal that LLMs are sensitive to uncertainties in utterances; despite this sensitivity, they are still able to extract the relevant data context. However, LLMs struggle to infer visualization tasks. Based on these results, we highlight future research directions on using LLMs for visualization generation.