The complexity of navigating digital privacy, safety, and security threats often falls directly on users. As a result, users seek help from family and peers, platforms and advice guides, dedicated communities, and even large language models (LLMs). As a precursor to improving resources across this ecosystem, our community needs to understand what help seeking looks like in the wild. To that end, we blend qualitative coding with LLM fine-tuning to sift through over one billion Reddit posts from the last four years and identify where, and for what, users seek digital privacy, safety, or security help. We isolate three million relevant posts with 93% precision and recall and automatically annotate each with the topics discussed (e.g., security tools, privacy configurations, scams, account compromise, and content moderation). We use this dataset to understand the scope and scale of help seeking, the communities that provide help, and the types of help sought. Our work informs the development of better resources for users (e.g., user guides or LLM help-giving agents) while underscoring the inherent challenges of supporting users through complex combinations of threats, platforms, mitigations, context, and emotions.