Agnostic Active Learning of Single Index Models with Linear Sample Complexity

We study active learning methods for single index models of the form $F({\mathbf x}) = f(\langle {\mathbf w}, {\mathbf x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\mathbf x,\mathbf w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning, such as surrogate modeling for partial differential equations (PDEs). Such applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $f$ is known and Lipschitz, we show that $\tilde{O}(d)$ samples collected via statistical leverage score sampling are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent $O(d^{2})$ bound of \cite{gajjar2023active}. Second, we show that $\tilde{O}(d)$ samples suffice even in the more difficult setting when $f$ is \emph{unknown}. Our results leverage tools from high dimensional probability, including Dudley's inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.
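To make the sampling step in the abstract concrete, the following is a minimal sketch of generic leverage score sampling for a data matrix, not the paper's exact procedure. The function names `leverage_scores` and `leverage_sample`, the sample size `m`, and the synthetic data are assumptions introduced for illustration. Row $i$ of an $n \times d$ matrix $X$ has leverage score $\tau_i = {\mathbf x}_i^\top (X^\top X)^{+} {\mathbf x}_i$, rows are drawn with probability proportional to $\tau_i$, and each drawn row gets weight $1/(m p_i)$ so that weighted sums over the sample are unbiased estimates of sums over all rows.

```python
import numpy as np

def leverage_scores(X):
    """Leverage scores tau_i = x_i^T (X^T X)^+ x_i for the rows of X."""
    # Thin SVD X = U S V^T; tau_i is the squared norm of row i of U,
    # restricted to directions with nonzero singular value.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    tol = s.max() * max(X.shape) * np.finfo(X.dtype).eps
    U = U[:, s > tol]
    return np.sum(U**2, axis=1)

def leverage_sample(X, m, rng):
    """Draw m row indices with probability proportional to leverage score.

    Returns the indices and importance weights 1 / (m * p_i), so that
    sum_j w_j * g(idx_j) is an unbiased estimate of sum_i g(i).
    """
    tau = leverage_scores(X)
    p = tau / tau.sum()
    idx = rng.choice(X.shape[0], size=m, replace=True, p=p)
    weights = 1.0 / (m * p[idx])
    return idx, weights

# Hypothetical data: 1000 points in d = 5 dimensions, with one outlying row.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
X[0] *= 50.0  # a row with large norm receives a large leverage score

idx, w = leverage_sample(X, m=50, rng=rng)
# Labels would be queried only at X[idx]; a model f(<w, x>) could then be fit
# by minimizing the weighted squared loss over those m labeled points.
```

In this sketch the sampling distribution depends only on $X$, never on labels, which is what makes it usable for active learning: labels are requested only for the selected rows. How many samples are needed, and how the fitted model is constrained, are the subject of the paper's analysis and are not reproduced here.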

