Privacy is a human right. It ensures that individuals are free to engage in discussions, participate in groups, and form relationships online or offline without fear of their data being inappropriately harvested, analyzed, or otherwise used to harm them. Preserving privacy has emerged as a critical factor in research, particularly in the computational social science (CSS), artificial intelligence (AI) and data science domains, given their reliance on individuals' data for novel insights. The increasing use of advanced computational models stands to exacerbate privacy concerns because, if inappropriately used, they can quickly infringe privacy rights and lead to adverse effects for individuals -- especially vulnerable groups -- and society. We have already witnessed a host of privacy issues emerge with the advent of large language models (LLMs), such as ChatGPT, which further demonstrate the importance of embedding privacy from the start. This article contributes to the field by discussing the role of privacy and the issues that researchers working in CSS, AI, data science and related domains are likely to face. It then presents several key considerations for researchers to ensure participant privacy is best preserved in their research design, data collection and use, analysis, and dissemination of research results.