We study the problem of computationally efficient proper agnostic learning of multidimensional concept classes under the Gaussian distribution. In this setting, given i.i.d. labeled samples from an unknown distribution over $\mathbb{R}^d \times \{\pm 1\}$ whose marginal on $\mathbb{R}^d$ is Gaussian, the goal is to output a hypothesis from a target class $\mathcal{F}$ whose 0-1 loss is within $\epsilon$ of that of the best classifier in $\mathcal{F}$. We give the first efficient proper agnostic learning algorithm for arbitrary Boolean functions of $K$ halfspaces under Gaussian marginals. Our algorithm runs in time $d^{O(K^2 \log(1/\epsilon)/\epsilon^2)} + (K/\epsilon)^{O(K^3/\epsilon^{2.5})}$. Prior to our work, the only known algorithm for $K \geq 2$ was brute-force search, with run-time exponential in $d$. Moreover, the dependence of our run-time on the dimension $d$ matches that of the best known improper learning algorithm, namely $d^{\widetilde{O}(K^2/\epsilon^2)}$. For the special case of a single halfspace ($K=1$), the best previous run-time was $d^{O(1/\epsilon^4)} + (1/\epsilon)^{O(1/\epsilon^6)}$. Our algorithm improves this to $d^{O(1/\epsilon^2)} + (1/\epsilon)^{O(1/\epsilon^{2.5})}$. Once again, the dependence on $d$ matches that of the best known improper algorithm, namely $d^{O(1/\epsilon^2)}$. Furthermore, the dependence of our run-time on the dimension $d$ is essentially optimal in the statistical query model.
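To make the setting concrete, the following minimal Python sketch builds a Boolean function of $K$ halfspaces and measures its empirical 0-1 loss on labeled Gaussian samples. It illustrates the problem setup only and is not the learning algorithm described above. All names (`halfspace_sign`, `make_hypothesis`, `intersection`, `empirical_zero_one_loss`, `gaussian_point`, `demo`) are hypothetical, and the standard Gaussian marginal $N(0, I_d)$, the choice of an intersection as the Boolean combiner, and the random label flips are assumptions made for illustration; in the agnostic model the labels may be arbitrary.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A halfspace is a pair (weight vector w, threshold t); hypothetical representation.
Halfspace = Tuple[Sequence[float], float]
Sample = Tuple[Sequence[float], int]


def halfspace_sign(w: Sequence[float], t: float, x: Sequence[float]) -> int:
    """Return +1 if <w, x> >= t, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= t else -1


def make_hypothesis(
    halfspaces: List[Halfspace],
    combine: Callable[[Sequence[int]], int],
) -> Callable[[Sequence[float]], int]:
    """Build f(x) = combine(b_1, ..., b_K), where b_i is the sign of halfspace i at x."""
    def f(x: Sequence[float]) -> int:
        return combine([halfspace_sign(w, t, x) for w, t in halfspaces])
    return f


def intersection(bits: Sequence[int]) -> int:
    """One example of a Boolean combiner: +1 only if every halfspace says +1."""
    return 1 if all(b == 1 for b in bits) else -1


def empirical_zero_one_loss(f: Callable[[Sequence[float]], int],
                            samples: List[Sample]) -> float:
    """Fraction of samples (x, y) on which f(x) != y."""
    return sum(1 for x, y in samples if f(x) != y) / len(samples)


def gaussian_point(d: int, rng: random.Random) -> List[float]:
    """Draw x from N(0, I_d); the standard Gaussian is an assumption here."""
    return [rng.gauss(0.0, 1.0) for _ in range(d)]


def demo(d: int = 5, n: int = 2000, flip_prob: float = 0.1, seed: int = 0) -> float:
    """Label Gaussian points by an intersection of K = 2 halfspaces, flip each
    label with probability flip_prob (illustrative noise only), and return the
    empirical 0-1 loss of the labeling function itself."""
    rng = random.Random(seed)
    target = make_hypothesis(
        [([1.0] + [0.0] * (d - 1), 0.0), ([0.0, 1.0] + [0.0] * (d - 2), -0.5)],
        intersection,
    )
    samples: List[Sample] = []
    for _ in range(n):
        x = gaussian_point(d, rng)
        y = target(x)
        if rng.random() < flip_prob:
            y = -y
        samples.append((x, y))
    return empirical_zero_one_loss(target, samples)


if __name__ == "__main__":
    print(f"empirical 0-1 loss of the labeling function: {demo():.3f}")
```

In this sketch, a proper learner would have to return a hypothesis of the same form as `target` (a Boolean combination of $K$ halfspaces) whose 0-1 loss is within $\epsilon$ of the best such hypothesis, whereas an improper learner may return any classifier.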