Despite the extent of recent advances in Machine Learning (ML) and Neural Networks, providing formal guarantees on the behavior of these systems is still an open problem, and a crucial requirement for their adoption in regulated or safety-critical scenarios. We consider the task of training differentiable ML models guaranteed to satisfy designer-chosen properties, stated as input-output implications. This is very challenging, due to the computational complexity of rigorously verifying and enforcing compliance in modern neural models. We provide an innovative approach based on three components: 1) a general, simple architecture enabling efficient verification with conservative semantics; 2) a rigorous training algorithm based on the Projected Gradient Method; 3) a formulation of the problem of searching for strong counterexamples. The proposed framework, being only marginally affected by model complexity, scales well to practical applications, and produces models that provide full property satisfaction guarantees. We evaluate our approach on properties defined by linear inequalities in regression, and on mutually exclusive classes in multilabel classification. Our approach is competitive with a baseline that enforces properties during preprocessing, i.e., on the training data, as well as during postprocessing, i.e., on the model predictions. Finally, our contributions establish a framework that opens up multiple research directions and potential improvements.
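To make the second component concrete, the following is a minimal sketch of the Projected Gradient Method on a toy convex problem: each gradient step is followed by a Euclidean projection onto a set defined by a linear inequality, so every iterate satisfies the constraint by construction. The specific objective, constraint, and step size here are illustrative assumptions, not the paper's actual training setup.

```python
import numpy as np

def project_halfspace(y, a, b):
    """Project y onto the halfspace {y : a @ y <= b} (closest point in L2)."""
    violation = a @ y - b
    if violation <= 0:
        return y  # already feasible
    # Move along the constraint normal just enough to reach the boundary.
    return y - (violation / (a @ a)) * a

# Toy projected-gradient loop (illustrative): minimize ||y - target||^2
# subject to the linear inequality a @ y <= b.
a = np.array([1.0, 1.0])
b = 1.0
target = np.array([2.0, 2.0])
y = np.zeros(2)
lr = 0.1
for _ in range(200):
    grad = 2.0 * (y - target)        # gradient of the objective
    y = project_halfspace(y - lr * grad, a, b)  # step, then project
# y converges to the feasible point closest to target, (0.5, 0.5)
```

In the paper's setting the projection acts within model training so that the property holds by construction, which is what yields full satisfaction guarantees rather than the approximate compliance of a soft penalty.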