Background. Federated learning (FL) has gained wide popularity as a paradigm for collaborative AI in sensitive healthcare applications. Nevertheless, deploying FL in practice presents technical and organizational challenges, as it generally requires complex communication infrastructures. In this context, consensus-based learning (CBL) may represent a promising alternative, thanks to its ability to combine local knowledge into a federated decision system while potentially reducing deployment overhead. Methods. In this work we propose an extensive benchmark of the accuracy and cost-effectiveness of a panel of FL and CBL methods across a wide range of collaborative medical data analysis scenarios. The benchmark includes 7 medical datasets, encompassing 3 machine learning tasks, 8 data modalities, and multi-centric settings involving 3 to 23 clients. Findings. Our results reveal that CBL is a cost-effective alternative to FL. Across the panel of medical datasets in the benchmark, CBL methods achieve accuracy equivalent to that of FL while significantly reducing training time and communication cost (15-fold and 60-fold decreases, respectively; p < 0.05). Interpretation. This study opens a novel perspective on the deployment of collaborative AI in real-world applications, where the adoption of cost-effective methods is instrumental to achieving the sustainability and democratisation of AI by alleviating the need for extensive computational resources.