Self-disclosure, while common and rewarding in social media interaction, also poses privacy risks. In this paper, we take the initiative to protect the user-side privacy associated with online self-disclosure through detection and abstraction. We develop a taxonomy of 19 self-disclosure categories and curate a large corpus of 4.8K annotated disclosure spans. We then fine-tune a language model for detection, achieving over 65% partial span F$_1$. We further conduct an HCI user study, with 82% of participants viewing the model positively, highlighting its real-world applicability. Motivated by the user feedback, we introduce the task of self-disclosure abstraction, i.e., rephrasing disclosures into less specific terms while preserving their utility, e.g., "Im 16F" to "I'm a teenage girl". We explore various fine-tuning strategies, and our best model generates diverse abstractions that moderately reduce privacy risks while maintaining high utility, according to human evaluation. To help users decide which disclosures to abstract, we present a task of rating their importance for context understanding. Our fine-tuned model achieves 80% accuracy, on par with GPT-3.5. Given safety and privacy considerations, we will only release our corpus and models to researchers who agree to the ethical guidelines outlined in the Ethics Statement.
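For reference, a standard token-level formulation of partial span F$_1$ (assumed here for illustration; the exact overlap criterion used in evaluation may differ) is
\[
  P = \frac{|T_{\text{pred}} \cap T_{\text{gold}}|}{|T_{\text{pred}}|}, \qquad
  R = \frac{|T_{\text{pred}} \cap T_{\text{gold}}|}{|T_{\text{gold}}|}, \qquad
  F_1 = \frac{2PR}{P + R},
\]
where $T_{\text{pred}}$ and $T_{\text{gold}}$ denote the sets of tokens covered by predicted and annotated disclosure spans, respectively, so that partially overlapping spans still receive credit.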