Collaborative learning techniques have significantly advanced in recent years, enabling private model training across multiple organizations. Despite this opportunity, firms face a dilemma when considering data sharing with competitors -- while collaboration can improve a company's machine learning model, it may also benefit competitors and hence reduce profits. In this work, we introduce a general framework for analyzing this data-sharing trade-off. The framework consists of three components, representing the firms' production decisions, the effect of additional data on model quality, and the data-sharing negotiation process, respectively. We then study an instantiation of the framework, based on a conventional market model from economic theory, to identify key factors that affect collaboration incentives. Our findings indicate a profound impact of market conditions on the data-sharing incentives. In particular, we find that reduced competition, in terms of the similarities between the firms' products, and harder learning tasks foster collaboration.
翻译:协作学习技术近年来取得了显著进展,使得跨组织的私有模型训练成为可能。尽管存在这一机遇,企业在考虑与竞争对手共享数据时仍面临两难困境——协作虽能提升公司机器学习模型性能,但也可能使竞争对手受益,进而降低利润。本文提出一个分析这种数据共享权衡的通用框架。该框架包含三个组成部分,分别代表企业的生产决策、新增数据对模型质量的影响以及数据共享协商过程。随后,我们基于经济理论中的传统市场模型研究了该框架的一个实例,以识别影响协作激励的关键因素。研究结果表明,市场条件对数据共享激励具有深远影响。具体而言,我们发现市场竞争程度(以企业产品相似性衡量)的减弱以及学习任务的难度提升会促进协作。