From Vulnerable Data Subjects to Vulnerabilizing Data Practices: Navigating the Protection Paradox in AI-Based Analyses of Platformized Lives

From arXiv. In The 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT '26), June 25-28, 2026, Montreal, QC, Canada. ACM, New York, NY, USA, 23 pages.

This paper traces a conceptual shift from understanding vulnerability as a static, essentialized property of data subjects to examining how it is actively enacted through data practices. Unlike reflexive ethical frameworks focused on missing or counter-data, we address the condition of abundance inherent to platformized life: a context where a near-inexhaustible mass of data points already exists, shifting the ethical challenge to the researcher's choices in operating upon this existing mass. We argue that the ethical integrity of data science depends not just on who is studied, but on how technical pipelines transform "vulnerable" individuals into data subjects whose vulnerability can be further precarized. We develop this argument through an AI for Social Good (AI4SG) case: a journalist's request to use computer vision to quantify child presence in monetized YouTube "family vlogs" for regulatory advocacy. This case reveals a "protection paradox": how data-driven efforts to protect vulnerable subjects can inadvertently impose new forms of computational exposure, reductionism, and extraction. Using this request as a point of departure, we perform a methodological deconstruction of the AI pipeline to show how granular technical decisions are ethically constitutive. We contribute a reflexive ethics protocol that translates these insights into a reflexive roadmap for research ethics surrounding platformized data subjects. Organized around four critical junctures (dataset design, operationalization, inference, and dissemination), the protocol identifies technical questions and ethical tensions where well-intentioned work can slide into renewed extraction or exposure. For every decision point, the protocol offers specific prompts to navigate four cross-cutting vulnerabilizing factors: exposure, monetization, narrative fixing, and algorithmic optimization. Rather than uncritically...



