The concept class of low-degree polynomial threshold functions (PTFs) plays a fundamental role in machine learning. In this paper, we study PAC learning of $K$-sparse degree-$d$ PTFs on $\mathbb{R}^n$, where any such concept depends on only $K$ of the $n$ attributes of the input. Our main contribution is a new algorithm that runs in time $({nd}/{\epsilon})^{O(d)}$ and, under the Gaussian marginal distribution, PAC learns the class up to error rate $\epsilon$ with $O(\frac{K^{4d}}{\epsilon^{2d}} \cdot \log^{5d} n)$ samples, even when an $\eta \leq O(\epsilon^d)$ fraction of them are corrupted by the nasty noise of Bshouty et al. (2002), possibly the strongest corruption model. Prior to this work, attribute-efficient robust algorithms were established only for the special case of sparse homogeneous halfspaces. Our key ingredients are: 1) a structural result that translates the attribute sparsity into a sparsity pattern of the Chow vector under the basis of Hermite polynomials, and 2) a novel attribute-efficient robust Chow vector estimation algorithm that uses exclusively a restricted Frobenius norm either to certify a good approximation or to validate a sparsity-induced degree-$2d$ polynomial as a filter to detect corrupted samples.
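The first ingredient, sparsity of the Chow vector in the Hermite basis, can be illustrated numerically. The sketch below is not the paper's algorithm: it uses clean (uncorrupted) Gaussian samples, fixes $d = 2$, and simply averages $f(x)$ against the normalized Hermite polynomials of degree at most 2. The function name `empirical_chow_deg2`, the example PTF $f(x) = \mathrm{sign}(x_0 x_1 + x_0)$, and the chosen sizes are all assumptions made for illustration.

```python
import numpy as np

def empirical_chow_deg2(X, y):
    """Empirical Chow coefficients E[y * H(x)] for Hermite degree <= 2.

    Hypothetical helper for illustration; assumes rows of X are N(0, I).
    Returns (c0, c1, C2): degree-0 scalar, degree-1 vector, and a
    symmetric matrix of degree-2 coefficients.
    """
    N = X.shape[0]
    c0 = y.mean()                       # E[y]
    c1 = X.T @ y / N                    # E[y * x_i]
    C2 = (X.T * y) @ X / N              # E[y * x_i * x_j]
    # For i == j the normalized Hermite polynomial is (x_i^2 - 1) / sqrt(2).
    diag = (np.diag(C2) - c0) / np.sqrt(2.0)
    np.fill_diagonal(C2, diag)
    return c0, c1, C2

# A 2-sparse degree-2 PTF on R^6 (assumed example): only x_0, x_1 matter.
rng = np.random.default_rng(0)
n, N = 6, 200_000
support = [0, 1]
X = rng.standard_normal((N, n))
y = np.sign(X[:, 0] * X[:, 1] + X[:, 0])

c0, c1, C2 = empirical_chow_deg2(X, y)

# Coefficients touching any irrelevant attribute should vanish up to
# sampling error; those supported on {0, 1} may be far from zero.
off_support = np.ones((n, n), dtype=bool)
off_support[np.ix_(support, support)] = False
print("max |deg-1 coeff| off support:", np.abs(c1[2:]).max())
print("max |deg-2 coeff| off support:", np.abs(C2[off_support]).max())
print("deg-1 coeff on x_0:", c1[0], " deg-2 coeff on x_0*x_1:", C2[0, 1])
```

In this toy setting the off-support coefficients are zero in expectation because $f(x)$ is independent of the irrelevant coordinates and every non-constant Hermite polynomial has zero Gaussian mean. Handling an $\eta$ fraction of nasty noise, which is the hard part addressed by the second ingredient, is deliberately left out of this sketch.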