Supervised learning assumes the presence of labeled data, but we may also have prior information about how models should behave. In this paper, we formalize this notion as learning from explanation constraints and provide a learning-theoretic framework to analyze how such explanations can improve the learning of our models. For what models would explanations be helpful? Our first key contribution addresses this question by defining what we call EPAC models (models that satisfy these constraints in expectation over new data), which we analyze using standard learning-theoretic tools. Our second key contribution is to characterize these restrictions (in terms of their Rademacher complexities) for a canonical class of explanations given by gradient information, for linear models and two-layer neural networks. Finally, we provide an algorithmic solution for our framework via a variational approximation that achieves better performance and satisfies these constraints more frequently than simpler augmented Lagrangian methods for incorporating these explanations. We demonstrate the benefits of our approach across a large array of synthetic and real-world experiments.
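To make the idea of a gradient-based explanation constraint concrete, the following is a minimal sketch of the simpler augmented Lagrangian style of baseline mentioned above, not the variational method proposed in the paper. It assumes a toy setting that is not taken from the paper: a linear model h(x) = w·x with squared loss, and a hypothetical explanation stating that one input feature is irrelevant, i.e. the gradient of h with respect to that feature is zero. The function name `fit_with_gradient_constraint`, the penalty weight `rho`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def fit_with_gradient_constraint(X, y, feature=0, rho=10.0, n_outer=50):
    """Least squares under the (assumed) explanation constraint
    d h(x) / d x[feature] = 0, enforced with an augmented Lagrangian.

    For a linear model h(x) = w @ x the input gradient is w itself,
    so the constraint reduces to w[feature] = 0.
    """
    n, d = X.shape
    e = np.zeros(d)
    e[feature] = 1.0                      # selects the constrained coordinate
    H = (2.0 / n) * (X.T @ X)             # Hessian of the mean squared loss
    b = (2.0 / n) * (X.T @ y)
    lam = 0.0                             # Lagrange multiplier estimate
    w = np.zeros(d)
    for _ in range(n_outer):
        # Inner step: minimize loss + lam * c(w) + (rho / 2) * c(w)^2,
        # with c(w) = w[feature]; this is quadratic, so solve it exactly.
        w = np.linalg.solve(H + rho * np.outer(e, e), b - lam * e)
        # Outer step: dual ascent on the multiplier.
        lam += rho * w[feature]
    return w

# Synthetic data in which feature 0 does carry signal, so the
# unconstrained fit violates the hypothetical explanation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([0.5, 2.0, -1.0])
y = X @ w_true + 0.1 * rng.normal(size=200)

w_unc = np.linalg.lstsq(X, y, rcond=None)[0]   # no explanation constraint
w_con = fit_with_gradient_constraint(X, y)     # constraint enforced
print("unconstrained:", w_unc)
print("constrained:  ", w_con)
```

In this toy case the constraint is linear and the loss is quadratic, so the inner minimization has a closed form; for neural networks the input gradient depends on x and the inner step would instead need an iterative optimizer.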