Self-disclosure, while common and rewarding in social media interaction, also poses privacy risks. In this paper, we take the initiative to protect user-side privacy in online self-disclosure through detection and abstraction. We develop a taxonomy of 19 self-disclosure categories and curate a large corpus of 4.8K annotated disclosure spans. We then fine-tune a language model for detection, achieving over 65% partial span F$_1$. We further conduct an HCI user study in which 82% of participants view the model positively, highlighting its real-world applicability. Motivated by the user feedback, we introduce the task of self-disclosure abstraction, which paraphrases disclosures into less specific terms while preserving their utility, e.g., "Im 16F" to "I'm a teenage girl". We explore various fine-tuning strategies, and our best model can generate diverse abstractions that moderately reduce privacy risks while maintaining high utility according to human evaluation. To help users decide which disclosures to abstract, we present a task of rating their importance for context understanding. Our fine-tuned model achieves 80% accuracy, on par with GPT-3.5. Given safety and privacy considerations, we will only release our corpus to researchers who agree to ethical guidelines.