As AI solutions for image-based problems become widespread, concerns about both data privacy and data acquisition have grown. In many cases, information resides in separate data silos, and it can be difficult for a developer to consolidate all of it in a form suitable for machine learning model development. In addition, some of these local data silos may not have access to labelled ground truth. They can therefore process their data numerically, but cannot assign classifications because the pertinent label information is missing. This limitation is rarely negligible, especially for image-based solutions, which often require classification capability. We therefore propose an innovative vertical federated learning (VFL) model architecture that can operate under this common set of conditions. This is the first (and currently the only) implementation of a system that can work under the constraints of a VFL environment and perform image segmentation while maintaining nominal accuracies. We achieve this with a fully convolutional network (FCN) that can operate on federates lacking labelled data and privately share the respective weights with a central server, which hosts the necessary features for classification. Tests were conducted on the CamVid dataset to determine the impact of the heavy feature compression required for transferring information between federates, and to reach nominal conclusions about overall performance metrics under such constraints.