Humor is a central aspect of human communication that has not yet been solved for artificial agents. Large language models (LLMs) are increasingly able to capture implicit and contextual information. OpenAI's ChatGPT in particular has recently gained immense public attention. The GPT3-based model almost seems to communicate on a human level and can even tell jokes. But is ChatGPT really funny? We put ChatGPT's sense of humor to the test. In a series of exploratory experiments around jokes, i.e., generation, explanation, and detection, we seek to understand ChatGPT's capability to grasp and reproduce human humor. Since the model itself is not accessible, we applied prompt-based experiments. Our empirical evidence indicates that jokes are not hard-coded but are mostly also not newly generated by the model: over 90% of 1008 generated jokes were the same 25 jokes. The system accurately explains valid jokes but also comes up with fictional explanations for invalid jokes. Joke-typical characteristics can mislead ChatGPT in the classification of jokes. ChatGPT has not solved computational humor yet, but it can be a big leap toward "funny" machines.