Social norms fundamentally shape interpersonal communication. We present NormDial, a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures. To introduce the task of social norm observance detection, we synthetically generate the dataset in both Chinese and English using a human-in-the-loop pipeline that prompts large language models with a small collection of expert-annotated social norms. We show through human evaluation that the generated dialogues are of high quality, and we further evaluate the performance of existing large language models on this task. Our findings point towards new directions for understanding the nuances of social norms as they manifest in conversational contexts across languages and cultures.