Collaborative learning techniques have advanced significantly in recent years, enabling private model training across multiple organizations. Despite this opportunity, firms face a dilemma when considering data sharing with competitors: while collaboration can improve a company's machine learning model, it may also benefit competitors and hence reduce the company's profits. In this work, we introduce a general framework for analyzing this data-sharing trade-off. The framework consists of three components, representing the firms' production decisions, the effect of additional data on model quality, and the data-sharing negotiation process, respectively. We then study an instantiation of the framework, based on a conventional market model from economic theory, to identify key factors that affect collaboration incentives. Our findings indicate that market conditions have a profound impact on data-sharing incentives. In particular, we find that reduced competition, measured by the similarity between the firms' products, and harder learning tasks foster collaboration.