Learning Intersections of Halfspaces with Distribution Shift: Improved Algorithms and SQ Lower Bounds

Recent work of Klivans, Stavropoulos, and Vasilyan initiated the study of testable learning with distribution shift (TDS learning), where a learner is given labeled samples from a training distribution $\mathcal{D}$ and unlabeled samples from a test distribution $\mathcal{D}'$, and the goal is to output a classifier with low error on $\mathcal{D}'$ whenever the training samples pass a corresponding test. Their model deviates from all prior work in that no assumptions are made on $\mathcal{D}'$. Instead, the test must accept (with high probability) when the marginals of the training and test distributions are equal. Here we focus on the fundamental case of intersections of halfspaces with respect to Gaussian training distributions and prove a variety of new upper bounds, including a $2^{(k/\epsilon)^{O(1)}} \mathsf{poly}(d)$-time algorithm for TDS learning intersections of $k$ homogeneous halfspaces to accuracy $\epsilon$ (prior work achieved $d^{(k/\epsilon)^{O(1)}}$). We work under the mild assumption that the Gaussian training distribution contains at least an $\epsilon$ fraction of both positive and negative examples ($\epsilon$-balanced). We also prove the first set of statistical query (SQ) lower bounds for any TDS learning problem and show (1) that the $\epsilon$-balanced assumption is necessary for $\mathsf{poly}(d,1/\epsilon)$-time TDS learning of a single halfspace and (2) a $d^{\tilde{\Omega}(\log 1/\epsilon)}$ lower bound for the intersection of two general halfspaces, even with the $\epsilon$-balanced assumption. Our techniques significantly expand the toolkit for TDS learning. We use dimension reduction and coverings to give efficient algorithms for computing a localized version of discrepancy distance, a key metric from the domain adaptation literature.
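To make the notion of a localized discrepancy distance concrete, the following is a minimal brute-force sketch, not the algorithm of this work. It assumes a finite list of candidate intersections of homogeneous halfspaces stands in for a covering, and it assumes "localized" means restricting to candidates that disagree with a reference hypothesis on at most a `radius` fraction of training points. The function names, `radius`, and the demo parameters are all hypothetical choices made for illustration.

```python
import numpy as np


def intersection_labels(W, X):
    """Label +1 where every homogeneous halfspace w . x >= 0 holds, else -1.

    W has shape (k, d), one normal vector per row; X has shape (n, d).
    """
    return np.where((X @ W.T >= 0).all(axis=1), 1, -1)


def localized_discrepancy(ref_W, candidates, X_train, X_test, radius):
    """Largest gap between test and training disagreement with the reference,
    taken over candidates that stay within `radius` of the reference on the
    training sample (an assumed form of localization)."""
    ref_train = intersection_labels(ref_W, X_train)
    ref_test = intersection_labels(ref_W, X_test)
    worst = 0.0
    for W in candidates:
        dis_train = np.mean(intersection_labels(W, X_train) != ref_train)
        if dis_train > radius:
            continue  # candidate is outside the local neighborhood
        dis_test = np.mean(intersection_labels(W, X_test) != ref_test)
        worst = max(worst, abs(dis_test - dis_train))
    return worst


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 5, 2
    ref_W = rng.standard_normal((k, d))
    # Hypothetical stand-in for a covering: random perturbations of the reference.
    candidates = [ref_W + 0.3 * rng.standard_normal((k, d)) for _ in range(200)]
    X_train = rng.standard_normal((5000, d))   # Gaussian training marginal
    X_same = rng.standard_normal((5000, d))    # test marginal equal to training
    X_shift = X_same + 1.5                     # test marginal with a mean shift
    print("no shift:  ", localized_discrepancy(ref_W, candidates, X_train, X_same, 0.2))
    print("mean shift:", localized_discrepancy(ref_W, candidates, X_train, X_shift, 0.2))
```

A tester built on such a quantity would accept when the estimate is small and reject otherwise. The loop here is linear in the number of candidates, so it says nothing about the running times stated above, which rely on dimension reduction to keep the covering small.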
