Machine teaching often involves constructing an optimal (typically minimal) dataset that helps a model (the "student") achieve specific goals set by a teacher. Machine teaching has been studied extensively in the continuous domain, but studies of its effectiveness in the discrete domain remain relatively limited. This paper focuses on machine teaching in the discrete domain, specifically on efficiently changing the training data so that a student model's predictions match the teacher's goals. We formulate this task as a combinatorial optimization problem and propose an iterative search algorithm to solve it. Our algorithm shows significant numerical merit in two scenarios: a teacher who corrects erroneous predictions to improve the student model, and a teacher who maliciously manipulates the model so that it misclassifies specific samples into a target class that serves the teacher's own interests. Experimental results show that the proposed algorithm manipulates the model's predictions more effectively and efficiently than conventional baselines.
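To make the problem setting concrete, the following is a minimal sketch of a greedy iterative search over training-set edits. It is not the paper's algorithm, which the abstract does not specify. Everything here is an assumption chosen for illustration: the student is a k-nearest-neighbor classifier over discrete feature vectors with Hamming distance, the only allowed edit is relabeling one training example, and the names `predict`, `target_votes`, and `greedy_teach` are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions where two discrete feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_labels(train, x, k):
    """Labels of the k training examples closest to x (ties keep list order)."""
    nearest = sorted(train, key=lambda ex: hamming(ex[0], x))[:k]
    return [label for _, label in nearest]


def predict(train, x, k=3):
    """Assumed student: majority vote among the k nearest neighbors."""
    return Counter(nearest_labels(train, x, k)).most_common(1)[0][0]


def target_votes(train, x, target, k=3):
    """Search objective: how many of the k neighbors vote for the target."""
    return sum(label == target for label in nearest_labels(train, x, k))


def greedy_teach(train, x, target, budget, k=3):
    """Relabel one training example per iteration until the student
    predicts `target` for x or the edit budget is exhausted."""
    train = list(train)  # work on a copy; the caller's list is untouched
    edits = []
    while predict(train, x, k) != target and len(edits) < budget:
        best_i, best_score = None, target_votes(train, x, target, k)
        for i, (features, label) in enumerate(train):
            if label == target:
                continue
            candidate = train[:i] + [(features, target)] + train[i + 1:]
            score = target_votes(candidate, x, target, k)
            if score > best_score:
                best_i, best_score = i, score
        if best_i is None:  # no single edit improves the objective
            break
        train[best_i] = (train[best_i][0], target)
        edits.append(best_i)
    return train, edits


# Toy data (made up for this sketch): five binary vectors, two classes.
train = [((0, 0, 0), "A"), ((0, 0, 1), "A"), ((0, 1, 1), "A"),
         ((1, 1, 1), "B"), ((1, 1, 0), "B")]
x = (0, 0, 0)
new_train, edits = greedy_teach(train, x, target="B", budget=3)
print(edits)  # [0, 1]
```

On this toy data the student initially predicts "A" for `x`, and two relabeling edits are enough to move the prediction to "B". A greedy search of this kind gives no guarantee that the edit set is minimal; it only illustrates how the teaching goal can be posed as a search over discrete changes to the training data.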