Humour is a substantial element of human affect and cognition. Its automatic understanding can facilitate more naturalistic human-device interaction and the humanisation of artificial intelligence. Current humour detection methods rely exclusively on staged data, which makes them inadequate for 'real-world' applications. We address this deficiency by introducing the novel Passau-Spontaneous Football Coach Humour (Passau-SFCH) dataset, comprising about 11 hours of recordings. The Passau-SFCH dataset is annotated for the presence of humour and its dimensions (sentiment and direction) as proposed in Martin's Humor Styles Questionnaire. We conduct a series of experiments employing pretrained Transformers, convolutional neural networks, and expert-designed features. We analyse the performance of each modality (text, audio, video) for spontaneous humour recognition and investigate their complementarity. Our findings suggest that facial expressions are the most promising for the automatic analysis of humour and its sentiment, while humour direction is best modelled via text-based features. The results reveal considerable differences across subjects, highlighting the individuality of humour usage and style. Further, we observe that decision-level fusion yields the best recognition results. Finally, we make our code publicly available at https://www.github.com/EIHW/passau-sfch. The Passau-SFCH dataset is available upon request.
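To illustrate what decision-level (late) fusion means here, the following is a minimal sketch in which each modality first produces its own humour score per segment and only these scores are combined. The abstract does not specify the fusion scheme, so the weighted average, the function name `late_fusion`, the 0.5 threshold, and all score values are assumptions made for illustration; they are not taken from the released code.

```python
from typing import Dict, List, Optional


def late_fusion(
    scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Combine per-modality scores by a weighted average (hypothetical helper)."""
    if not scores:
        raise ValueError("need at least one modality")
    if weights is None:
        # Assumption: without further information, weight all modalities equally.
        weights = {name: 1.0 for name in scores}
    lengths = {len(values) for values in scores.values()}
    if len(lengths) != 1:
        raise ValueError("all modalities must score the same segments")
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_segments = lengths.pop()
    return [
        sum(weights[name] * scores[name][i] for name in scores) / total
        for i in range(n_segments)
    ]


# Made-up unimodal humour scores for three segments (not real results).
unimodal_scores = {
    "text": [0.2, 0.7, 0.4],
    "audio": [0.1, 0.6, 0.5],
    "video": [0.3, 0.9, 0.2],
}
fused = late_fusion(unimodal_scores)
# Assumed decision threshold of 0.5 on the fused score.
predictions = [score >= 0.5 for score in fused]
```

Because the unimodal models are trained independently and only their outputs are merged, a scheme of this kind lets each modality keep its own feature extractor, and the weights could be set, for example, from each modality's validation performance.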