Identifying the Hazard Boundary of ML-enabled Autonomous Systems Using Cooperative Co-Evolutionary Search

In Machine Learning (ML)-enabled autonomous systems (MLASs), it is essential to identify the hazard boundary of the ML Components (MLCs) in the MLAS under analysis. Because such a boundary captures the conditions, in terms of MLC behavior and system context, that can lead to hazards, it can be used, for example, to build a safety monitor that triggers predefined fallback mechanisms at runtime when the hazard boundary is reached. However, determining such a hazard boundary for an ML component is challenging. First, the space combining system contexts (i.e., scenarios) and MLC behaviors (i.e., inputs and outputs) is far too large for exhaustive exploration, and even for conventional metaheuristics such as genetic algorithms. Second, the high computational cost of the simulations required to detect MLAS safety violations makes the problem even more challenging. Third, it is unrealistic to consider a region of the problem space deterministically safe or unsafe, given the uncontrollable parameters in simulations and the non-linear behaviors of ML models (e.g., deep neural networks) in the MLAS under analysis. To address these challenges, we propose MLCSHE (ML Component Safety Hazard Envelope), a novel method based on a Cooperative Co-Evolutionary Algorithm (CCEA), which tackles a high-dimensional problem by decomposing it into two lower-dimensional search subproblems. Moreover, we take a probabilistic view of safe and unsafe regions and define a novel fitness function that measures the distance from the probabilistic hazard boundary and thus drives the search effectively. We evaluate the effectiveness and efficiency of MLCSHE on a complex Autonomous Vehicle (AV) case study. Our results show that MLCSHE is significantly more effective and efficient than a standard genetic algorithm and random search.
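To make the decomposition idea concrete, the following is a minimal sketch of a generic cooperative co-evolutionary loop, not the MLCSHE implementation. It assumes a toy noisy function, `simulate_unsafe`, in place of a real simulator, and a hypothetical fitness, `boundary_distance`, that estimates the probability of a violation over repeated runs and measures how far it lies from an assumed threshold of 0.5. All names, parameters, and the selection scheme are illustrative assumptions.

```python
import random


def simulate_unsafe(scenario, behavior, rng):
    """Toy stand-in for an expensive, noisy simulation (an assumption).

    Returns True when a (simulated) safety violation occurs.
    """
    risk = min(1.0, max(0.0, 0.5 * (scenario[0] + behavior[0])))
    return rng.random() < risk


def boundary_distance(scenario, behavior, rng, runs=20, threshold=0.5):
    """Hypothetical fitness: |estimated P(unsafe) - threshold|.

    Lower values mean the joint solution is closer to the assumed
    probabilistic boundary between safe and unsafe regions.
    """
    unsafe = sum(simulate_unsafe(scenario, behavior, rng) for _ in range(runs))
    return abs(unsafe / runs - threshold)


def mutate(individual, rng, sigma=0.1):
    """Gaussian mutation, clamped to the [0, 1] search range."""
    return [min(1.0, max(0.0, x + rng.gauss(0.0, sigma))) for x in individual]


def evolve(population, partners, rng, is_scenario):
    """Advance one population by one generation.

    An individual is only half of a solution, so it is scored by the best
    joint fitness it achieves with collaborators from the other population.
    """
    def score(individual):
        if is_scenario:
            return min(boundary_distance(individual, p, rng) for p in partners)
        return min(boundary_distance(p, individual, rng) for p in partners)

    ranked = sorted(population, key=score)
    survivors = ranked[: len(population) // 2]
    children = [
        mutate(rng.choice(survivors), rng)
        for _ in range(len(population) - len(survivors))
    ]
    return survivors + children


def ccea(generations=10, pop_size=8, dim=2, seed=0):
    """Co-evolve a scenario population and an MLC-behavior population."""
    rng = random.Random(seed)
    scenarios = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    behaviors = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        scenarios = evolve(scenarios, rng.sample(behaviors, 3), rng, True)
        behaviors = evolve(behaviors, rng.sample(scenarios, 3), rng, False)
    return scenarios, behaviors
```

Calling `ccea()` returns the two final populations. Each population searches only its own lower-dimensional space, while fitness is always computed on a complete scenario-behavior pair, which is the essence of the cooperative decomposition.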
