Re-thinking Federated Active Learning based on Inter-class Diversity

Although federated learning has made remarkable advances, most studies assume that each client's data are fully labeled. In real-world scenarios, however, every client may hold a significant amount of unlabeled instances. Among the various approaches to utilizing unlabeled data, federated active learning (FAL) has emerged as a promising solution. In the decentralized setting, two types of query selector models are available, namely the 'global' and 'local-only' models, but little literature discusses which one performs better and why. In this work, we first demonstrate that which of the two selector models is superior depends on the global and local inter-class diversity. Furthermore, we observe that the global and local-only models are the keys to resolving the imbalance on each side. Based on our findings, we propose LoGo, a FAL sampling strategy that is robust to varying levels of local heterogeneity and global imbalance ratios and that integrates both models through a two-step active selection scheme. LoGo consistently outperforms six active learning strategies across a total of 38 experimental settings.
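The abstract does not spell out what the two selection steps are, so the following is only a minimal sketch under stated assumptions: the first step groups a client's unlabeled samples by clustering embeddings taken from the local-only model, and the second step picks the most uncertain sample in each cluster according to the global model's predicted probabilities. The function names (`kmeans`, `entropy`, `two_step_select`), the use of k-means, and the use of entropy as the uncertainty score are all illustrative choices, not details confirmed by the text above.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers with k distinct points (fancy indexing copies them).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = points[assign == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return assign


def entropy(probs):
    """Shannon entropy of each row of an (n, classes) probability matrix."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def two_step_select(local_embeddings, global_probs, budget, seed=0):
    """Return up to `budget` indices of unlabeled samples to query.

    Step 1 (assumed, local-only model): cluster samples into `budget` groups.
    Step 2 (assumed, global model): take the highest-entropy sample per group.
    """
    assign = kmeans(local_embeddings, budget, seed=seed)
    scores = entropy(global_probs)
    picked = []
    for c in range(budget):
        members = np.flatnonzero(assign == c)
        if len(members) > 0:
            picked.append(int(members[scores[members].argmax()]))
    return picked


# Toy usage: random stand-ins for the two models' outputs on one client.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 8))   # hypothetical local-only features
logits = rng.normal(size=(200, 10))      # hypothetical global-model logits
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
queries = two_step_select(embeddings, probs, budget=5)
```

Because each cluster contributes at most one sample, the query set is spread across the groups found by the local-only model, while the global model decides which member of each group is most informative. Either role could be filled by a different clustering or uncertainty method; the split shown here is an assumption made for illustration.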


