To leverage data from multiple parties for the sufficient training of machine learning (ML) models in a confidentiality-preserving way, various collaborative distributed ML (CDML) system designs have been developed, for example, to perform assisted learning, federated learning, and split learning. CDML system designs exhibit different traits, such as high agent autonomy, ML model confidentiality, and fault tolerance. Given this wide variety of designs and traits, it is difficult for developers to design CDML systems whose traits match use case requirements in a targeted way. Inappropriate CDML system designs, in turn, may result in CDML systems failing their envisioned purposes. We developed a CDML design toolbox that can guide the development of CDML systems. Based on the CDML design toolbox, we present CDML system archetypes with distinct key traits that can support the design of CDML systems that meet use case requirements.