Modern applications are increasingly driven by Machine Learning (ML) models whose non-deterministic behavior affects the entire application life cycle, from design to operation. The pervasive adoption of ML urgently calls for approaches that guarantee a stable non-functional behavior of ML-based applications over time and across model changes. To this end, non-functional properties of ML models, such as privacy, confidentiality, fairness, and explainability, must be monitored, verified, and maintained. Existing approaches mostly focus on i) implementing solutions for classifier selection according to the functional behavior of ML models, and ii) finding new algorithmic solutions, such as continuous re-training. In this paper, we propose a multi-model approach that aims to guarantee a stable non-functional behavior of ML-based applications. We provide an architectural and methodological approach to compare multiple ML models showing similar non-functional properties, and to select the model supporting stable non-functional behavior over time according to (dynamic and unpredictable) contextual changes. Our approach goes beyond the state of the art by providing a solution that continuously guarantees a stable non-functional behavior of ML-based applications, is ML algorithm-agnostic, and is driven by non-functional properties assessed on the ML models themselves. It consists of a two-step process carried out during application operation: model assessment verifies the non-functional properties of ML models trained and selected at development time, and model substitution guarantees continuous and stable support of these properties. We experimentally evaluate our solution in a real-world scenario, focusing on the non-functional property of fairness.
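The two-step assess-and-substitute process described above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's implementation): it assumes candidate models are interchangeable prediction functions, measures fairness as demographic parity difference, and substitutes the active model when its fairness on fresh data exceeds a tolerance; the function names and the tolerance threshold are illustrative assumptions.

```python
# Hypothetical sketch of the assess-and-substitute loop: fairness is measured
# as demographic parity difference, and the active model is replaced when its
# fairness on the current batch drifts past a tolerance.

def demographic_parity_difference(preds, groups):
    """Max difference in positive-prediction rate across groups."""
    rate = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        rate[g] = sum(preds[i] for i in idx) / len(idx)
    vals = list(rate.values())
    return max(vals) - min(vals)

def assess_and_substitute(models, active, batch, groups, tolerance=0.1):
    """Step 1 (assessment): check the active model's fairness on the batch.
    Step 2 (substitution): if it fails, switch to the fairest candidate."""
    scores = {name: demographic_parity_difference([m(x) for x in batch], groups)
              for name, m in models.items()}
    if scores[active] <= tolerance:
        return active                       # assessment passed: keep model
    return min(scores, key=scores.get)      # substitute fairest candidate

# Toy usage: two thresholded classifiers, binary sensitive attribute.
models = {"m1": lambda x: int(x > 0.5), "m2": lambda x: int(x > 0.2)}
batch = [0.1, 0.3, 0.6, 0.9]
groups = ["a", "a", "b", "b"]
print(assess_and_substitute(models, "m1", batch, groups))  # m1 violates, so m2
```

In this toy run, m1 predicts positives only for group "b" (parity difference 1.0), so it fails assessment and the fairer candidate m2 (parity difference 0.5) is substituted.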