This paper explores the design of a balanced data-sharing marketplace for entities that hold heterogeneous datasets and machine learning models and seek to refine those models using data from other agents. The goal of the marketplace is to encourage participation in data sharing despite this heterogeneity. Our market design approach focuses on interim utility balance, where participants contribute and receive equitable utility from the refinement of their models. We present such a market model and study its computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. Finally, we support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation.