Predicting and explaining the private information contained in an image in human-understandable terms is a complex and contextual task. This task is challenging even for large language models. To facilitate the understanding of privacy decisions, we propose to predict image privacy based on a set of natural language content descriptors. These content descriptors are associated with privacy scores that reflect how people perceive image content. We generate the descriptors with our novel Image-guided Topic Modeling (ITM) approach. Via multimodal alignment, ITM leverages both visual information and image textual descriptions from a vision language model. We use the ITM-generated descriptors to learn a privacy predictor, Priv$\times$ITM, whose decisions are interpretable by design. Our Priv$\times$ITM classifier outperforms the reference interpretable method by 5 percentage points in accuracy and performs comparably to the current non-interpretable state-of-the-art model.