Social Virtual Reality (VR) platforms provide immersive social experiences but also expose users to serious risks of online harassment. Existing safety measures are largely reactive, while proactive solutions that detect harassment as an incident unfolds often depend on sensitive biometric data, raising privacy concerns. In this paper, we present HarassGuard, a vision-language model (VLM)-based system that detects physical harassment in social VR using only visual input. We construct an IRB-approved harassment vision dataset, apply prompt engineering, and fine-tune VLMs to detect harassment behavior by considering contextual information in social VR. Experimental results demonstrate that HarassGuard achieves competitive performance compared to state-of-the-art baselines (i.e., LSTM/CNN, Transformer), reaching an accuracy of up to 88.09% in binary classification and 68.85% in multi-class classification. Notably, HarassGuard matches these baselines while using significantly fewer fine-tuning samples (200 vs. 1,115), and it offers unique advantages in contextual reasoning and privacy-preserving detection.