The exponential growth of social media has profoundly transformed how information is created, disseminated, and absorbed, at a scale unprecedented in the digital age. Regrettably, this growth has also brought a significant increase in the online abuse of memes. Evaluating the negative impact of memes is notably challenging because their meaning is often subtle and implicit, not conveyed directly by the overt text and imagery. Large multimodal models (LMMs) have become a focal point of interest owing to their remarkable capabilities across diverse multimodal tasks. In response, this paper thoroughly examines the capacity of various LMMs (e.g., GPT-4V) to discern and respond to the nuanced aspects of social abuse manifested in memes. We introduce a comprehensive meme benchmark, GOAT-Bench, comprising over 6K varied memes that cover themes such as implicit hate speech, sexism, and cyberbullying. Using GOAT-Bench, we investigate the ability of LMMs to accurately assess hatefulness, misogyny, offensiveness, sarcasm, and harmful content. Our extensive experiments across a range of LMMs reveal that current models still exhibit a deficiency in safety awareness, showing insensitivity to various forms of implicit abuse. We posit that this shortfall is a critical impediment to the realization of safe artificial intelligence. GOAT-Bench and accompanying resources are publicly accessible at https://goatlmm.github.io/, contributing to ongoing research in this vital field.