Large Language Models (LLMs), a significant achievement of artificial intelligence (AI) research, have demonstrated strong performance across a wide range of tasks. This project explores the capabilities of GPT-3.5, a leading LLM, in the sentiment analysis of Internet memes. Memes, which combine verbal and visual elements, are a powerful yet complex vehicle for expressing ideas and sentiments, and interpreting them demands an understanding of societal norms and cultural contexts. Detecting and moderating hateful memes is particularly challenging because their offensiveness is often implicit. This project investigates GPT's proficiency in such subjective tasks, revealing its strengths and potential limitations. The tasks are classifying meme sentiment, determining humor type, and detecting implicit hate in memes. The performance evaluation, using datasets from SemEval-2020 Task 8 and Facebook Hateful Memes, compares GPT responses against human annotations. Despite GPT's remarkable progress, our findings underscore the difficulty these models have with subjective tasks, which is rooted in inherent limitations in contextual understanding, interpretation of implicit meanings, and data biases. This research contributes to the broader discourse on the applicability of AI to complex, context-dependent tasks and offers insights for future work.
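The comparison of GPT responses against human annotations described in the abstract could be scored along the following lines. This is a minimal sketch, not the project's actual evaluation code: the label names, the toy data, and the choice of accuracy and macro-averaged F1 as metrics are all assumptions made for illustration.

```python
from collections import Counter


def accuracy(human_labels, model_labels):
    """Fraction of items where the model label matches the human label."""
    if len(human_labels) != len(model_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(h == m for h, m in zip(human_labels, model_labels))
    return matches / len(human_labels)


def macro_f1(human_labels, model_labels):
    """Unweighted mean of per-class F1, treating human labels as ground truth."""
    if len(human_labels) != len(model_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("label lists must not be empty")
    tp, fp, fn = Counter(), Counter(), Counter()
    for h, m in zip(human_labels, model_labels):
        if h == m:
            tp[h] += 1
        else:
            fp[m] += 1  # model predicted class m, but the annotation differs
            fn[h] += 1  # model missed the annotated class h
    classes = sorted(set(human_labels) | set(model_labels))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: three-way sentiment labels for four memes.
human = ["positive", "neutral", "negative", "positive"]
model = ["positive", "positive", "negative", "neutral"]

print("accuracy:", accuracy(human, model))
print("macro-F1:", macro_f1(human, model))
```

Macro averaging weights every class equally, which is one reasonable choice when some sentiment or hate categories are much rarer than others. Whether it matches the metric used in the project is not stated in the abstract.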