Several statistical models exist for the regression of a function $F$ on $\mathbb{R}^d$ that avoid the statistical and computational curse of dimensionality, for example by imposing and exploiting geometric assumptions on the distribution of the data (e.g., that its support is low-dimensional), strong smoothness assumptions on $F$, or a special structure of $F$. Among the latter, compositional models, which assume $F=f\circ g$ with $g$ mapping to $\mathbb{R}^r$ for $r\ll d$, have been studied; they include the classical single- and multi-index models as well as recent works on neural networks. While the case where $g$ is linear is rather well-understood, much less is known when $g$ is nonlinear, and in particular for which $g$'s the curse of dimensionality in estimating $F$, or both $f$ and $g$, may be circumvented. In this paper, we consider a model $F(X):=f(\Pi_\gamma X)$, where $\Pi_\gamma:\mathbb{R}^d\to[0,\mathrm{len}_\gamma]$ is the closest-point projection onto the parameter of a regular curve $\gamma:[0,\mathrm{len}_\gamma]\to\mathbb{R}^d$ and $f:[0,\mathrm{len}_\gamma]\to\mathbb{R}$. The input data $X$ is not low-dimensional and may be far from $\gamma$; it is only conditioned on $\Pi_\gamma(X)$ being well-defined. The distribution of the data, $\gamma$, and $f$ are all unknown. This model is a natural nonlinear generalization of the single-index model, which corresponds to $\gamma$ being a line. We propose a nonparametric estimator, based on conditional regression, and show that under suitable assumptions, the strongest of which is that $f$ is coarsely monotone, it can achieve the \emph{one-dimensional} optimal minimax rate for nonparametric regression, up to the level of noise in the observations, and can be constructed in time $\mathcal{O}(d^2 n\log n)$. All the constants in the learning bounds, in the minimal number of samples required for our bounds to hold, and in the computational complexity are at most low-order polynomials in $d$.
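To make the model concrete, recall that the closest-point projection is $\Pi_\gamma(x)=\arg\min_{t\in[0,\mathrm{len}_\gamma]}\|x-\gamma(t)\|_2$, defined wherever the minimizer is unique. The following minimal sketch generates synthetic data from $F(X)=f(\Pi_\gamma X)$; the helix for $\gamma$, the monotone link $f(s)=\sqrt{s}$, and the brute-force nearest-neighbor projection over a discretized curve are all illustrative choices, not ingredients of the paper's estimator.

```python
# A minimal synthetic sketch of the model F(X) = f(Pi_gamma(X)).
# The helix, the link f, and the brute-force projection are assumptions
# made for illustration; they are not the paper's construction.
import numpy as np

rng = np.random.default_rng(0)

# Discretize a regular curve gamma in R^3 (here: a helix) and compute
# its cumulative arclength, playing the role of the parameter in [0, len_gamma].
t_fine = np.linspace(0.0, 4.0 * np.pi, 4000)
curve = np.stack([np.cos(t_fine), np.sin(t_fine), 0.5 * t_fine], axis=1)
seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
arclen = np.concatenate([[0.0], np.cumsum(seg)])

def proj_gamma(x):
    """Closest-point projection: arclength of the nearest discretized curve point."""
    i = np.argmin(np.linalg.norm(curve - x, axis=1))
    return arclen[i]

f = lambda s: np.sqrt(s)  # a monotone link function, chosen for illustration

# Sample X near-but-not-on gamma: a curve point plus ambient noise in R^3,
# so that Pi_gamma(X) stays well-defined with high probability.
n = 5
X = curve[rng.integers(0, len(curve), n)] + 0.2 * rng.standard_normal((n, 3))
Y = np.array([f(proj_gamma(x)) for x in X])  # noiseless responses F(X)
print(Y)
```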