CS4ML: A general framework for active learning with arbitrary data based on Christoffel functions

We introduce a general framework for active learning in regression problems. Our framework extends the standard setup by allowing for general types of data, rather than merely pointwise samples of the target function. This generalization covers many cases of practical interest, such as data acquired in transform domains (e.g., Fourier data), vector-valued data (e.g., gradient-augmented data), data acquired along continuous curves, and multimodal data (i.e., combinations of different types of measurements). Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces (model classes). We introduce the concept of generalized Christoffel functions and show how these can be used to optimize the sampling measures. We prove that this leads to near-optimal sample complexity in various important cases. This paper focuses on applications in scientific computing, where active learning is often desirable because data is usually expensive to generate. We demonstrate the efficacy of our framework for gradient-augmented learning with polynomials, Magnetic Resonance Imaging (MRI) using generative models, and adaptive sampling for solving PDEs using Physics-Informed Neural Networks (PINNs).
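To make the role of Christoffel functions concrete, the following is a minimal sketch of the classical special case only: pointwise samples, a linear polynomial space on [-1, 1], and a single sampling measure. It is not the generalized framework of the paper. Sample points are drawn from a fine grid with probability proportional to K(x) = sum_j phi_j(x)^2, the reciprocal of the Christoffel function of the space, and the fit is a least-squares problem weighted by n / K(x). The function names, the grid-based discretization, the target function exp, and the sizes n and m are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre(x, n):
    """First n Legendre polynomials, orthonormal w.r.t. the uniform
    probability measure on [-1, 1], evaluated at x (one column each)."""
    V = legendre.legvander(x, n - 1)            # columns P_0, ..., P_{n-1}
    return V * np.sqrt(2 * np.arange(n) + 1)    # phi_j = sqrt(2j + 1) * P_j


def christoffel_sample(grid, n, m, rng):
    """Draw m grid points with probability proportional to K(x) and
    return them with the corresponding weights n / K(x)."""
    B = orthonormal_legendre(grid, n)
    K = np.sum(B**2, axis=1)                    # K(x) = sum_j phi_j(x)^2
    idx = rng.choice(len(grid), size=m, p=K / K.sum())
    return grid[idx], n / K[idx]


def weighted_least_squares(f, x, w, n):
    """Solve min_c sum_i w_i * (f(x_i) - sum_j c_j phi_j(x_i))^2."""
    sqrt_w = np.sqrt(w)
    A = sqrt_w[:, None] * orthonormal_legendre(x, n)
    b = sqrt_w * f(x)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef


# Illustrative setup (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 20001)   # fine grid standing in for the domain
n, m = 10, 100                         # space dimension, number of samples

x, w = christoffel_sample(grid, n, m, rng)
coef = weighted_least_squares(np.exp, x, w, n)

# Uniform-norm error of the fitted polynomial on a separate test grid.
test_pts = np.linspace(-1.0, 1.0, 501)
err = np.max(np.abs(orthonormal_legendre(test_pts, n) @ coef - np.exp(test_pts)))
```

In this classical setting the weights compensate for the non-uniform sampling density, so the weighted estimator still targets the best approximation with respect to the original measure. The abstract's generalized Christoffel functions extend this idea to other data types and to nonlinear model classes, which this sketch does not attempt.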

