Hate speech is a severe problem on many online platforms. Several studies have sought to develop robust hate speech detection systems. Large language models such as ChatGPT have recently shown great potential on a range of tasks, including hate speech detection. However, building more robust detection systems requires understanding the limitations of these models. To bridge this gap, our study evaluates the weaknesses of the ChatGPT model in detecting hate speech at a granular level across 11 languages. In addition, we investigate how complex emotions, expressed for example through emojis in hate speech, influence the model's performance. Through our analysis, we examine the errors made by the model, shedding light on its shortcomings in detecting certain types of hate speech and highlighting the need for further research and improvements in hate speech detection.