LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning

We propose LESS-VFL, a communication-efficient feature selection method for distributed systems with vertically partitioned data. We consider a system of a server and several parties with local datasets that share a sample ID space but have different feature sets. The parties wish to collaboratively train a model for a prediction task. As part of the training, the parties wish to remove unimportant features in the system to improve generalization, efficiency, and explainability. In LESS-VFL, after a short pre-training period, the server optimizes its part of the global model to determine the relevant outputs from party models. This information is shared with the parties to then allow local feature selection without communication. We analytically prove that LESS-VFL removes spurious features from model training. We provide extensive empirical evidence that LESS-VFL can achieve high accuracy and remove spurious features at a fraction of the communication cost of other feature selection approaches.
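The server-side step described above (deciding which party-model outputs are relevant) can be illustrated with a small sketch. The abstract does not say how the server makes this decision, so the sketch assumes a column-wise group-lasso penalty on the server's first-layer weights, applied through its proximal operator (group soft-thresholding). The names `group_soft_threshold`, `selected_columns`, `W_server`, and `tau` are hypothetical and chosen for illustration only.

```python
import numpy as np

def group_soft_threshold(W, tau):
    """Proximal operator of tau * sum_j ||W[:, j]||_2.

    Each column of W is treated as one group. Columns whose L2 norm
    is at most tau are set exactly to zero; the others are shrunk.
    """
    norms = np.linalg.norm(W, axis=0)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale

def selected_columns(W, tol=1e-8):
    """Indices of columns that survived the thresholding."""
    return np.flatnonzero(np.linalg.norm(W, axis=0) > tol)

# Toy vertically partitioned data: both parties hold the same sample IDs
# but different feature columns (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
n_samples = 6
party_features = {
    "party_a": rng.normal(size=(n_samples, 3)),
    "party_b": rng.normal(size=(n_samples, 2)),
}

# Hypothetical server first-layer weights after pre-training.
# Each column multiplies one output coordinate received from the parties.
W_server = np.array([
    [2.0,  0.05, -1.5, 0.00],
    [1.0, -0.02,  0.5, 0.01],
])

W_sparse = group_soft_threshold(W_server, tau=0.1)
relevant = selected_columns(W_sparse)  # party outputs the server still uses
```

In this sketch, the indices in `relevant` play the role of the information the server shares with the parties. Each party could then apply the same kind of group penalty to its own input-layer weights, with one group per raw feature, without further communication; that party-side step is likewise an assumption and not a detail given in the abstract.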

Feature selection

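Feature selection, also called feature subset selection (FSS) or attribute selection, is the task of choosing N features from an existing set of M features so that a given metric of the system is optimized. It reduces the dimensionality of a dataset by keeping the most effective of the original features, which makes it an important way to improve the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are key to training a model.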
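As a minimal illustration of choosing N of M features under a given criterion, the sketch below uses a simple filter method: it scores each feature by its absolute Pearson correlation with the target and keeps the N highest-scoring ones. The criterion, the function name `select_top_n_features`, and the synthetic data are assumptions made for this example; they are not the method of the paper above.

```python
import numpy as np

def select_top_n_features(X, y, n_select):
    """Return the indices of the n_select features most correlated with y.

    X has shape (num_samples, M); y has shape (num_samples,).
    The score of a feature is |Pearson correlation(feature, y)|.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scores = np.abs(Xc.T @ yc) / np.maximum(denom, 1e-12)
    top = np.argsort(scores)[::-1][:n_select]
    return np.sort(top), scores

# Synthetic data: M = 5 features, only features 1 and 4 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)

chosen, scores = select_top_n_features(X, y, n_select=2)
X_reduced = X[:, chosen]  # dataset restricted to the selected features
```

A filter like this scores each feature independently, so it can miss features that matter only in combination; embedded approaches that select features during model training address that at a higher cost.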
