Rubinfeld & Vasilyan recently introduced the framework of testable learning as an extension of the classical agnostic model. It replaces distributional assumptions that are difficult to verify with conditions that a tester can check efficiently. The tester has to accept whenever the data truly satisfies the original assumptions, and the learner has to succeed whenever the tester accepts. We focus on the setting where the tester has to accept standard Gaussian data. There, it is known that basic concept classes such as halfspaces can be learned testably with the same time complexity as in the (distribution-specific) agnostic model. In this work, we ask whether there is a price to pay for testably learning more complex concept classes. In particular, we consider polynomial threshold functions (PTFs), which naturally generalize halfspaces. We show that PTFs of arbitrary constant degree can be testably learned up to excess error $\varepsilon > 0$ in time $n^{\mathrm{poly}(1/\varepsilon)}$. This qualitatively matches the best known guarantees in the agnostic model. Our results build on a connection between testable learning and fooling. In particular, we show that distributions that approximately match at least $\mathrm{poly}(1/\varepsilon)$ moments of the standard Gaussian fool constant-degree PTFs (up to error $\varepsilon$). As a secondary result, we prove that a direct approach to showing testable learning (without fooling), which was successfully used for halfspaces, cannot work for PTFs.
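
To make the moment-matching condition concrete, the following is a minimal sketch of a tester that accepts a sample only if all of its empirical moments up to degree $k$ are close to those of the standard Gaussian. It is an illustration under stated assumptions, not the tester analyzed in this work: the function names, the uniform tolerance `tol`, and the brute-force enumeration of monomials are all hypothetical choices, and the abstract does not specify how $k$ or the tolerance should depend on $\varepsilon$.

```python
import itertools
import math


def gaussian_monomial_moment(alpha):
    """E[prod_i x_i^alpha_i] for x ~ N(0, I_n).

    Coordinates are independent, so the moment factorizes; each factor is
    (a - 1)!! for even a and 0 for odd a.
    """
    value = 1
    for a in alpha:
        if a % 2 == 1:
            return 0  # odd powers vanish by symmetry
        for j in range(1, a, 2):  # 1 * 3 * ... * (a - 1); empty product if a == 0
            value *= j
    return value


def multi_indices(n, k):
    """Yield all exponent vectors alpha in N^n with 1 <= |alpha| <= k.

    Brute force over (k + 1)^n candidates: fine for a toy example only.
    """
    for alpha in itertools.product(range(k + 1), repeat=n):
        if 1 <= sum(alpha) <= k:
            yield alpha


def moment_matching_tester(samples, k, tol):
    """Accept iff every empirical moment of degree <= k is within tol
    of the corresponding standard Gaussian moment.

    `tol` is a hypothetical uniform tolerance, not a value from the paper.
    """
    n = len(samples[0])
    m = len(samples)
    for alpha in multi_indices(n, k):
        empirical = sum(
            math.prod(x_i ** a for x_i, a in zip(x, alpha)) for x in samples
        ) / m
        if abs(empirical - gaussian_monomial_moment(alpha)) > tol:
            return False
    return True
```

For instance, the one-dimensional sample `[(1.0,), (-1.0,)]` has the same first three moments as the standard Gaussian, so this sketch accepts it for `k = 3`; its fourth moment is 1 rather than 3, so the sketch rejects it for `k = 4`.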