Pragmatic reasoning aims to resolve the implicit meanings that commonly occur in real life and is crucial for building communicative social agents. We introduce a new benchmark, Diplomat, which aims at a unified paradigm for pragmatic reasoning and situated conversational understanding. Whereas previous work treats different figurative expressions (e.g., metaphor, sarcasm) as individual tasks, Diplomat provides a unified framework for general pragmatic understanding. Our dataset is created using Amazon Mechanical Turk (AMT), resulting in 4,177 multi-turn dialogues. Alongside the dataset, we propose two tasks: Pragmatic Identification and Reasoning, and Conversational Question Answering. Experimental results with state-of-the-art (SOTA) neural architectures demonstrate that: 1) large language models (LLMs) perform poorly on this subjective topic; 2) context understanding is a crucial factor in building benign human-machine interaction; and 3) current models are deficient in applying pragmatic reasoning. We therefore call for more attention to improving context understanding, reasoning, and the modeling of implied meaning.