While computer systems that allow users to interact through conversational natural language (i.e., chatbots) have existed for many years, a wide variety of applications advertising AI companionship (e.g., Character AI, Replika) have proliferated in recent years due to advances in large language models. Our work offers a threat model encompassing two distinct risk categories: harms posed to users by AI companion applications, and harms enabled by malicious users exploiting application features. To further understand this application ecosystem, we identified 489 unique apps from the App Store and Play Store that advertised AI companionship. We then systematically conducted and analyzed walkthroughs of a stratified sample of 30 apps with respect to our threat model. Through our analysis, we categorize broader ecosystem trends that provide context for understanding threats, and we identify specific threats related to sensitive data collection and sharing, anthropomorphism, engagement mechanisms, sexual interactions and media, and the ingestion and reconstruction of likeness, including the potential for generating synthetic nonconsensual intimate imagery. This study provides a foundational security perspective on the AI companion application ecosystem and informs future research within and beyond this field, policy, and technical development. Content warning: This paper includes descriptions of applications that can be used to create synthetic nonconsensual representations, including explicit imagery, as well as discussion of self-harm and suicidal ideation.