To train machine learning (ML) models sufficiently on training data from multiple parties while preserving confidentiality, various collaborative distributed machine learning (CDML) system designs have been developed, for example, to perform assisted learning, federated learning, and split learning. CDML system designs exhibit different traits, such as high agent autonomy, ML model confidentiality, and fault tolerance. Given the wide variety of CDML system designs and their differing traits, it is difficult for developers to design, in a targeted way, CDML systems whose traits match use case requirements. Inappropriate CDML system designs, in turn, may result in CDML systems that fail to fulfill their envisioned purposes. We developed a CDML design toolbox that can guide the development of CDML systems. Based on the CDML design toolbox, we present CDML system archetypes with distinct key traits that can support the design of CDML systems that meet use case requirements.