In recent years, machine learning (ML) has relied heavily on crowdworkers both for building datasets and for addressing research questions requiring human interaction or judgment. The diversity of the tasks performed and of the uses of the resulting data makes it difficult to determine when crowdworkers are best thought of as workers rather than as human subjects. These difficulties are compounded by conflicting policies, with some institutions and researchers regarding all ML crowdworkers as human subjects and others holding that they rarely constitute human subjects. Notably, few ML papers involving crowdwork mention IRB oversight, raising the prospect of non-compliance with ethical and regulatory requirements. We investigate the appropriate designation of ML crowdsourcing studies, focusing our inquiry on natural language processing to expose unique challenges for research oversight. Crucially, under the U.S. Common Rule, these judgments hinge on determinations of aboutness, concerning both whom (or what) the collected data is about and whom (or what) the analysis is about. We highlight two challenges posed by ML: first, the same set of workers can serve multiple roles and provide many sorts of information; second, ML research tends to embrace a dynamic workflow, in which research questions are seldom stated ex ante and data sharing opens the door for future studies to aim questions at different targets. Our analysis exposes a potential loophole in the Common Rule, whereby researchers can elude research ethics oversight by splitting data collection and analysis into distinct studies. Finally, we offer several policy recommendations to address these concerns.