Child Sexual Abuse Imagery (CSAI) classification systems are needed both to lessen the psychological impact on the law enforcement agents who must evaluate this material and to remove it from the web efficiently. However, researching and developing such systems is far from trivial. The images are highly sensitive and the related datasets are under restrictive access regimes, so most studies in the area can be neither reproduced nor distributed, which makes them hard to compare and validate. More concerning still, most current models for this task lack a property that law enforcement agents often ask for: explainability. In this paper, we apply an ensemble of Proxy Tasks (tasks that correlate with CSAI classification), which improves reproducibility, explainability, and security of distribution. This is the first application of the concept to real CSAI, and it introduces a novel set of relevant Proxy Tasks, selected from the CSAI literature, together with training adaptations to the original framework. Our final model achieves competitive results, reaching 91.9% balanced accuracy on the RCPD dataset with the best Proxy Task combination. We also compare these results with the best-in-class representation learning model, DINO, and show that our ensemble improves accuracy while providing explanations for its classification results, something a single deep learning model can seldom offer.