Sarcasm is a form of verbal irony in which someone says the opposite of what they mean, often to ridicule a person, situation, or idea. Detecting sarcasm in dialogue is often difficult because it requires taking the context (i.e., the dialogue history) into account. In this paper, we introduce KoCoSa (Korean Context-aware Sarcasm Detection Dataset), a new dataset for the Korean dialogue sarcasm detection task, which consists of 12.8K daily Korean dialogues, each labeled for sarcasm on its last response. To build the dataset, we propose an efficient pipeline for generating sarcasm detection datasets: 1) generating new sarcastic dialogues from source dialogues with large language models, 2) automatically and manually filtering out abnormal and toxic dialogues, and 3) human annotation for the sarcasm detection task. We also provide a simple but effective baseline for the Korean sarcasm detection task, trained on our dataset. Experimental results on the dataset show that our baseline system outperforms strong baselines, including large language models such as GPT-3.5, on the Korean sarcasm detection task. We also show that sarcasm detection relies heavily on the availability of sufficient context. We will release the dataset at https://github.com/Yu-billie/KoCoSa_sarcasm_detection.