The cultural landscape of interactions with dialogue agents is a compelling yet relatively unexplored territory. Various sociocultural aspects, from communication styles and beliefs to shared metaphors and knowledge, profoundly impact these interactions. To explore this dynamic, we introduce cuDialog, a first-of-its-kind benchmark for dialogue generation with a cultural lens. We also develop baseline models capable of extracting cultural attributes from dialogue exchanges, with the goal of enhancing the predictive accuracy and quality of dialogue agents. To jointly learn cultural understanding and multi-turn dialogue prediction, we propose combining cultural dimensions with dialogue encoding features. Our experiments show that incorporating cultural value surveys improves alignment with references and cultural markers, demonstrating the considerable influence of this cultural information on personalization and dialogue quality. To facilitate further exploration in this domain, we make our benchmark publicly available at https://github.com/yongcaoplus/cuDialog.