We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms. We propose a simple theoretical model of a collective interacting with a firm's learning algorithm. The collective pools the data of participating individuals and executes an algorithmic strategy by instructing participants how to modify their own data to achieve a collective goal. We investigate the consequences of this model in three fundamental learning-theoretic settings: the case of a nonparametric optimal learning algorithm, a parametric risk minimizer, and gradient-based optimization. In each setting, we devise coordinated algorithmic strategies and characterize natural success criteria as a function of the collective's size. Complementing our theory, we conduct systematic experiments on a skill classification task involving tens of thousands of resumes from a gig platform for freelancers. Through more than two thousand model training runs of a BERT-like language model, we observe a striking correspondence between our empirical observations and the predictions made by our theory. Taken together, our theory and experiments broadly support the conclusion that algorithmic collectives of exceedingly small fractional size can exert significant control over a platform's learning algorithm.
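The core mechanism described above can be illustrated with a minimal simulation, a sketch under assumptions not taken from the paper: a collective controlling a small fraction `alpha` of the training data plants a signal (a large value in an otherwise uninformative feature) and reports a target label, and a logistic regression trained on the pooled data then predicts the target label on fresh points carrying the signal. The data distribution, trigger magnitude, and training hyperparameters here are all hypothetical choices for illustration, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 5000, 20
alpha = 0.02   # collective controls 2% of training points (assumption)
TARGET = 1.0   # label the collective wants signal-carrying points to receive

# Base data: random features, labels from a ground-truth linear rule.
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)

# Signal planting: collective members set an otherwise near-unused feature
# (the last coordinate) to a large value and report the target label.
k = int(alpha * n)
X[:k, -1] = 5.0
y[:k] = TARGET
# Non-participants leave the signal feature near zero.
X[k:, -1] = rng.normal(scale=0.1, size=n - k)

# Firm trains a logistic regression by plain gradient descent.
w = np.zeros(d)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * (X.T @ (p - y)) / n

# Success criterion: fraction of fresh signal-carrying test points
# that the learned model classifies as TARGET.
X_test = rng.normal(size=(200, d))
X_test[:, -1] = 5.0
preds = (1.0 / (1.0 + np.exp(-(X_test @ w))) > 0.5).astype(float)
success = float((preds == TARGET).mean())
print(f"success rate with alpha={alpha}: {success:.2f}")
```

Even at this small fractional size, the planted association dominates the signal coordinate because no other training points contradict it, which is the intuition behind the theory's size thresholds.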