Modern language models, while sophisticated, exhibit some inherent shortcomings, particularly in conversational settings. We claim that many of the observed shortcomings can be attributed to the violation of one or more conversational principles. By drawing upon extensive research from both the social science and AI communities, we propose a set of maxims -- quantity, quality, relevance, manner, benevolence, and transparency -- for describing effective human-AI conversation. We first justify the applicability of the first four maxims (from Grice) in the context of human-AI interactions. We then argue that two new maxims, benevolence (concerning the generation of, and engagement with, harmful content) and transparency (concerning recognition of one's knowledge boundaries, operational constraints, and intents), are necessary for addressing behavior unique to modern human-AI interactions. We evaluate the degree to which various language models are able to understand these maxims and find that models possess an internal prioritization of principles that can significantly impact their ability to interpret the maxims accurately.