An important challenge in robust machine learning arises when training data is provided by strategic sources who may intentionally report erroneous data for their own benefit. A line of work at the intersection of machine learning and mechanism design aims to deter strategic agents from reporting erroneous training data by designing learning algorithms that are strategyproof. Strategyproofness is a strong and desirable property, but it comes at a cost in the approximation ratio of even simple risk minimization problems. In this paper, we study strategyproof regression and classification problems in a model with advice. This model is part of a recent line of work on mechanism design with advice, where the goal is to achieve both an improved approximation ratio when the advice is correct (consistency) and a bounded approximation ratio when the advice is incorrect (robustness). We provide the first non-trivial consistency-robustness tradeoffs for strategyproof regression and classification, which hold for simple yet interesting classes of functions. For classes of constant functions, we give a deterministic and strategyproof mechanism that is, for any $\gamma \in (0, 2]$, $(1+\gamma)$-consistent and $(1 + 4/\gamma)$-robust, and we provide a lower bound showing that this tradeoff is optimal. We extend this mechanism and its guarantees to homogeneous linear regression over $\mathbb{R}$. In the binary classification problem of selecting from three or more labelings, we present strong impossibility results for both deterministic and randomized mechanisms. Finally, we provide deterministic and randomized mechanisms for selecting from two labelings.
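To make the tradeoff for constant functions concrete, the stated bounds can be evaluated at a few values of $\gamma$. The table below is plain arithmetic on the two expressions in the abstract, with the sample values of $\gamma$ chosen only for illustration; it is not an additional result of the paper.

$$
\begin{array}{c|c|c}
\gamma & \text{consistency } 1+\gamma & \text{robustness } 1+4/\gamma \\ \hline
2 & 3 & 3 \\
1 & 2 & 5 \\
1/2 & 3/2 & 9
\end{array}
$$

Writing $c = 1+\gamma$ and $r = 1+4/\gamma$, every point on this curve satisfies $(c-1)(r-1) = 4$. The two guarantees coincide at $\gamma = 2$, and as $\gamma \to 0$ the consistency approaches $1$ while the robustness bound grows without limit.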