Knowledge-Aware Federated Active Learning with Non-IID Data

Federated learning enables multiple decentralized clients to learn collaboratively without sharing their local training data. However, the high cost of annotating data on local clients remains an obstacle to using that data. In this paper, we propose a federated active learning paradigm that efficiently learns a global model with a limited annotation budget while protecting data privacy through decentralized learning. The main challenge in federated active learning is the mismatch between the active sampling goal of the global model on the server and that of the asynchronous local clients. This mismatch becomes even more significant when data is distributed non-IID across local clients. To address this challenge, we propose Knowledge-Aware Federated Active Learning (KAFAL), which consists of Knowledge-Specialized Active Sampling (KSAS) and Knowledge-Compensatory Federated Update (KCFU). KSAS is a novel active sampling method tailored to the federated active learning problem. It addresses the mismatch by sampling actively based on the discrepancies between the local and global models. KSAS intensifies the specialized knowledge in local clients, ensuring that the sampled data are informative for both the local clients and the global model. KCFU, meanwhile, deals with the client heterogeneity caused by limited data and non-IID data distributions. It compensates for each client's weakness on its weak classes with the assistance of the global model. Extensive experiments and analyses show the superiority of KSAS over state-of-the-art active learning methods and the efficiency of KCFU under the federated active learning framework.
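The idea of sampling by local-global discrepancy can be illustrated with a minimal sketch. It is not the authors' KSAS implementation: the abstract does not specify the discrepancy measure, so the symmetric KL divergence used here is an assumption, and the function names (`discrepancy_scores`, `select_for_annotation`) and the toy logits are hypothetical.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def discrepancy_scores(local_logits, global_logits):
    """Score each unlabeled sample by how much the two models disagree.

    Assumption: symmetric KL divergence stands in for the discrepancy
    measure, which the abstract does not define.
    """
    scores = []
    for l, g in zip(local_logits, global_logits):
        p, q = softmax(l), softmax(g)
        scores.append(kl_divergence(p, q) + kl_divergence(q, p))
    return scores

def select_for_annotation(local_logits, global_logits, budget):
    """Return indices of the `budget` samples with the largest discrepancy."""
    scores = discrepancy_scores(local_logits, global_logits)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]

# Toy unlabeled pool on one client: per-sample logits from each model.
local_logits = [[2.0, 0.1, 0.1], [0.2, 0.1, 3.0], [1.0, 1.0, 1.0]]
global_logits = [[2.0, 0.1, 0.1], [3.0, 0.1, 0.2], [1.0, 1.1, 0.9]]

selected = select_for_annotation(local_logits, global_logits, budget=1)
print(selected)
```

In this toy pool the two models agree on the first sample, disagree strongly on the second, and differ only slightly on the third, so the second sample is the one sent for annotation under a budget of one.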



