Identifying the Hazard Boundary of ML-enabled Autonomous Systems Using Cooperative Co-Evolutionary Search

In Machine Learning (ML)-enabled autonomous systems (MLASs), it is essential to identify the hazard boundary of ML Components (MLCs) in the MLAS under analysis. Because such a boundary captures the conditions, in terms of MLC behavior and system context, that can lead to hazards, it can be used, for example, to build a safety monitor that triggers predefined fallback mechanisms at runtime when the hazard boundary is reached. However, determining such a hazard boundary for an ML component is challenging. The problem space, which combines system contexts (i.e., scenarios) and MLC behaviors (i.e., inputs and outputs), is far too large for exhaustive exploration and even for conventional metaheuristics, such as genetic algorithms. Additionally, the high computational cost of the simulations required to detect MLAS safety violations makes the problem even more challenging. Furthermore, it is unrealistic to consider a region of the problem space deterministically safe or unsafe, due to the uncontrollable parameters in simulations and the non-linear behaviors of ML models (e.g., deep neural networks) in the MLAS under analysis. To address these challenges, we propose MLCSHE (ML Component Safety Hazard Envelope), a novel method based on a Cooperative Co-Evolutionary Algorithm (CCEA), which tackles a high-dimensional problem by decomposing it into two lower-dimensional search subproblems. Moreover, we take a probabilistic view of safe and unsafe regions and define a novel fitness function that measures the distance from the probabilistic hazard boundary and thus drives the search effectively. We evaluate the effectiveness and efficiency of MLCSHE on a complex Autonomous Vehicle (AV) case study. Our evaluation results show that MLCSHE is significantly more effective and efficient than a standard genetic algorithm and random search.
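
To make the decomposition idea concrete, the following is a minimal sketch of a generic cooperative co-evolutionary loop; it is not the MLCSHE implementation. Everything in it is an illustrative assumption: the one-dimensional `scenario` and `behavior` encodings, the toy stochastic simulator `simulate_unsafe`, and the fitness `boundary_distance`, which estimates the unsafe probability over repeated runs and measures its distance from an assumed threshold of 0.5. The abstract does not define the actual fitness function, so this stand-in only conveys the general shape of a probabilistic boundary search.

```python
import random


def simulate_unsafe(scenario, behavior, rng):
    """Toy stochastic simulator (hypothetical): True means a safety violation."""
    p_unsafe = min(1.0, max(0.0, scenario + behavior - 0.5))
    return rng.random() < p_unsafe


def boundary_distance(scenario, behavior, rng, runs=20, threshold=0.5):
    """Assumed fitness: |estimated P(unsafe) - threshold|; lower is closer to the boundary."""
    unsafe = sum(simulate_unsafe(scenario, behavior, rng) for _ in range(runs))
    return abs(unsafe / runs - threshold)


def mutate(x, rng, sigma=0.1):
    """Gaussian mutation, clamped to the [0, 1] search range."""
    return min(1.0, max(0.0, x + rng.gauss(0.0, sigma)))


def evolve(pop, fitnesses, rng):
    """Keep the better half (truncation selection) and refill with mutated copies."""
    ranked = [x for _, x in sorted(zip(fitnesses, pop))]
    survivors = ranked[: len(pop) // 2]
    children = [mutate(rng.choice(survivors), rng)
                for _ in range(len(pop) - len(survivors))]
    return survivors + children


def ccea(pop_size=10, generations=15, seed=0):
    """Co-evolve two populations; return (fitness, scenario, behavior) of the best pair seen."""
    rng = random.Random(seed)
    scenarios = [rng.random() for _ in range(pop_size)]
    behaviors = [rng.random() for _ in range(pop_size)]
    best_s, best_b = scenarios[0], behaviors[0]
    best = (float("inf"), best_s, best_b)
    for _ in range(generations):
        # Score each scenario by pairing it with collaborators from the
        # behavior population: the previous best plus one random individual.
        s_fit = []
        for s in scenarios:
            joint = [(boundary_distance(s, b, rng), s, b)
                     for b in (best_b, rng.choice(behaviors))]
            s_fit.append(min(joint)[0])
            best = min(best, min(joint))
        # Score each behavior symmetrically against scenario collaborators.
        b_fit = []
        for b in behaviors:
            joint = [(boundary_distance(s, b, rng), s, b)
                     for s in (best_s, rng.choice(scenarios))]
            b_fit.append(min(joint)[0])
            best = min(best, min(joint))
        # Remember this generation's best individuals as future collaborators.
        best_s = scenarios[s_fit.index(min(s_fit))]
        best_b = behaviors[b_fit.index(min(b_fit))]
        scenarios = evolve(scenarios, s_fit, rng)
        behaviors = evolve(behaviors, b_fit, rng)
    return best


if __name__ == "__main__":
    fitness, scenario, behavior = ccea()
    print(f"closest pair: scenario={scenario:.3f}, behavior={behavior:.3f}, "
          f"distance={fitness:.3f}")
```

Each population is evolved separately, but an individual can only be scored as part of a complete (scenario, behavior) pair, which is why the collaborator step is needed. Because the toy fitness is estimated from a small number of stochastic runs, it is noisy; a real setup would have to trade the number of simulation runs against the simulation cost noted above.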