We evaluate the ability of semantic parsers based on large language models (LLMs) to handle contextual utterances. In real-world settings, annotation cost typically limits the number of annotated contextual utterances, resulting in an imbalance relative to non-contextual utterances. Parsers must therefore adapt to contextual utterances from only a few training examples. We examine four major paradigms for doing so in conversational semantic parsing: Parse-with-Utterance-History, Parse-with-Reference-Program, Parse-then-Resolve, and Rewrite-then-Parse. To facilitate cross-paradigm comparison, we construct SMCalFlow-EventQueries, a subset of contextual examples from SMCalFlow with additional annotations. Experiments with in-context learning and fine-tuning suggest that Rewrite-then-Parse is the most promising paradigm when parsing accuracy, annotation cost, and error types are considered together.