We introduce a novel framework for learning context-aware runtime monitors for AI-based control ensembles. Machine-learning (ML) controllers are increasingly deployed in (autonomous) cyber-physical systems because of their ability to solve complex decision-making tasks. However, their accuracy can degrade sharply in unfamiliar environments, creating significant safety concerns. Traditional ensemble methods aim to improve robustness by averaging or voting across multiple controllers, yet this often dilutes the specialized strengths that individual controllers exhibit in different operating contexts. We argue that, rather than blending controller outputs, a monitoring framework should identify and exploit these contextual strengths. In this paper, we reformulate the design of safe AI-based control ensembles as a contextual monitoring problem. A monitor continuously observes the system's context and selects the controller best suited to the current conditions. To achieve this, we cast monitor learning as a contextual learning task and draw on techniques from contextual multi-armed bandits. Our approach comes with two key benefits: (1) theoretical safety guarantees during controller selection, and (2) improved utilization of controller diversity. We validate our framework in two simulated autonomous driving scenarios, demonstrating significant improvements in both safety and performance compared to non-contextual baselines.
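To make the selection step concrete, the following is a minimal sketch of a context-aware monitor built on LinUCB, a standard contextual multi-armed bandit algorithm. It is not the algorithm of this paper, which the abstract does not specify, and it carries none of the safety guarantees claimed above. The class `LinUCBMonitor`, the function `simulate`, the exploration weight `alpha`, the two one-hot contexts, and the toy safety scores are all assumptions introduced for illustration only.

```python
import numpy as np


class LinUCBMonitor:
    """Hypothetical monitor: one ridge-regression safety model per controller."""

    def __init__(self, n_controllers, context_dim, alpha=1.0):
        self.alpha = alpha  # exploration weight (assumed value, not from the paper)
        # Per-controller sufficient statistics: A = I + sum(x x^T), b = sum(r x).
        self.A = [np.eye(context_dim) for _ in range(n_controllers)]
        self.b = [np.zeros(context_dim) for _ in range(n_controllers)]

    def select(self, context, explore=True):
        """Return the index of the controller with the highest score."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            estimate = (A_inv @ b) @ context  # predicted safety score
            bonus = self.alpha * np.sqrt(context @ A_inv @ context)
            scores.append(estimate + (bonus if explore else 0.0))
        return int(np.argmax(scores))

    def update(self, controller, context, reward):
        """Fold one observed safety score into the chosen controller's model."""
        self.A[controller] += np.outer(context, context)
        self.b[controller] += reward * context


def simulate(steps=2000, noise=0.05, seed=0):
    """Toy environment: each controller is reliable in exactly one context."""
    rng = np.random.default_rng(seed)
    monitor = LinUCBMonitor(n_controllers=2, context_dim=2)
    # Assumed ground truth: row k holds controller k's safety score per context.
    true_scores = np.array([[0.9, 0.2], [0.2, 0.9]])
    contexts = np.eye(2)  # two one-hot contexts, e.g. "highway" and "urban"
    for _ in range(steps):
        x = contexts[rng.integers(2)]
        k = monitor.select(x)
        reward = true_scores[k] @ x + noise * rng.normal()
        monitor.update(k, x, reward)
    return monitor


if __name__ == "__main__":
    m = simulate()
    for x in np.eye(2):
        print(x, "->", m.select(x, explore=False))
```

In this toy setting a non-contextual baseline that commits to a single controller would be stuck with the low safety score in one of the two contexts, whereas the monitor can learn a different choice per context. That is the kind of controller diversity the abstract argues averaging or voting tends to dilute.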