One of the most basic lower bounds in machine learning is that in nearly any nontrivial setting, it takes $\textit{at least}$ $1/\epsilon$ samples to learn to error $\epsilon$ (and more, if the classifier being learned is complex). However, suppose that data points are agents who have the ability to improve by a small amount if doing so will allow them to receive a (desired) positive classification. In that case, we may actually be able to achieve $\textit{zero}$ error by just being "close enough". For example, imagine a hiring test used to measure an agent's skill at some job such that for some threshold $\theta$, agents who score above $\theta$ will be successful and those who score below $\theta$ will not (i.e., learning a threshold on the line). Suppose also that by putting in effort, agents can improve their skill level by some small amount $r$. In that case, if we learn an approximation $\hat{\theta}$ of $\theta$ such that $\theta \leq \hat{\theta} \leq \theta + r$ and use it for hiring, we can actually achieve error zero, in the sense that (a) any agent classified as positive is truly qualified, and (b) any agent who truly is qualified can be classified as positive by putting in effort. Thus, the ability of agents to improve has the potential to allow for a goal one could not hope to achieve in standard models, namely zero error. In this paper, we explore this phenomenon more broadly, giving general results and examining under what conditions the ability of agents to improve can allow for a reduction in the sample complexity of learning, or alternatively, can make learning harder. We also examine, both theoretically and empirically, what kinds of improvement-aware algorithms can take into account agents who have the ability to improve to a limited extent when it is in their interest to do so.
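The threshold example above can be checked with a small simulation. The sketch below is illustrative only: the values of $\theta$, $r$, and $\hat{\theta}$ are hypothetical choices satisfying $\theta \leq \hat{\theta} \leq \theta + r$, and the agent behavior (improving by exactly $r$ when, and only when, doing so flips them to a positive classification) is the assumed best-response model.

```python
import random

random.seed(0)

# Hypothetical parameters for illustration only.
theta = 0.6        # true success threshold (unknown to the learner)
r = 0.05           # improvement an agent can achieve through effort
theta_hat = 0.63   # learned threshold, assumed to satisfy theta <= theta_hat <= theta + r

def final_skill(skill: float) -> float:
    """An agent puts in effort (gaining r) only if it flips them to positive."""
    if skill < theta_hat <= skill + r:
        return skill + r
    return skill

def hired(skill: float) -> bool:
    """Classifier: positive iff (possibly improved) skill meets theta_hat."""
    return final_skill(skill) >= theta_hat

agents = [random.random() for _ in range(100_000)]

# (a) Every agent classified positive is truly qualified: since
# theta_hat >= theta, anyone hired has (post-effort) skill >= theta.
assert all(final_skill(s) >= theta for s in agents if hired(s))

# (b) Every truly qualified agent (skill >= theta) is hired: since
# theta_hat <= theta + r, effort of r always suffices to reach theta_hat.
assert all(hired(s) for s in agents if s >= theta)

print("zero classification error achieved")
```

Note that neither guarantee requires $\hat{\theta}$ to be within $\epsilon$ of $\theta$ for any fixed $\epsilon < r$; landing anywhere in the interval $[\theta, \theta + r]$ suffices, which is what allows the sample-complexity savings described above.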