This paper traces a conceptual shift from understanding vulnerability as a static, essentialized property of data subjects to examining how it is actively enacted through data practices. Unlike reflexive ethical frameworks focused on missing data or counter-data, we address the condition of abundance inherent to platformized life: a context where a nearly inexhaustible mass of data points already exists, which shifts the ethical challenge to the researcher's choices in operating upon that mass. We argue that the ethical integrity of data science depends not just on who is studied, but on how technical pipelines transform "vulnerable" individuals into data subjects whose vulnerability can be further precarized. We develop this argument through an AI for Social Good (AI4SG) case: a journalist's request to use computer vision to quantify child presence in monetized YouTube "family vlogs" for regulatory advocacy. This case reveals a "protection paradox": data-driven efforts to protect vulnerable subjects can inadvertently impose new forms of computational exposure, reductionism, and extraction. Using this request as a point of departure, we perform a methodological deconstruction of the AI pipeline to show how granular technical decisions are ethically constitutive. We contribute a reflexive ethics protocol that translates these insights into a roadmap for research ethics surrounding platformized data subjects. Organized around four critical junctures (dataset design, operationalization, inference, and dissemination), the protocol identifies technical questions and ethical tensions where well-intentioned work can slide into renewed extraction or exposure. For every decision point, the protocol offers specific prompts to navigate four cross-cutting vulnerabilizing factors: exposure, monetization, narrative fixing, and algorithmic optimization. Rather than uncritically...