Humor understanding is an important and challenging research problem in natural language processing. With the growing popularity of pre-trained language models (PLMs), some recent work has made preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question: {\em are PLMs capable of humor understanding?} This paper is the first work to systematically investigate the humor understanding ability of PLMs. For this purpose, we design a comprehensive framework with three evaluation steps and four evaluation tasks. We also construct a comprehensive Chinese humor dataset that fully meets the data requirements of the proposed evaluation framework. Our empirical study on this dataset yields several observations that can guide future optimization of PLMs for humor understanding and generation.