In learning-enabled autonomous systems, safety monitoring of learned components is crucial to ensure that their outputs do not lead to system safety violations in the system's operational context. However, developing a safety monitor for practical deployment in real-world applications is challenging, because access to the internal workings and training data of the learned component is limited. Furthermore, safety monitors should predict safety violations with low latency while consuming a reasonable amount of computation. To address these challenges, we propose a safety monitoring method based on probabilistic time series forecasting. Given the learned component's outputs and an operational context, we empirically investigate different Deep Learning (DL)-based probabilistic forecasting models for predicting the safety metric, that is, the objective measure capturing the satisfaction or violation of a safety requirement. We evaluate the accuracy of safety metric and violation prediction, as well as the inference latency and resource usage, of four state-of-the-art models over varying forecast horizons, using autonomous aviation and autonomous driving case studies. Our results suggest that probabilistic forecasting of safety metrics, given learned component outputs and scenarios, is effective for safety monitoring. Furthermore, in both case studies, the Temporal Fusion Transformer (TFT) was the most accurate model for predicting imminent safety violations, with acceptable latency and resource consumption.
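As a rough illustration of how a probabilistic forecast of a safety metric might be turned into a violation prediction, the sketch below scans a quantile forecast over the horizon and reports the first step at which a chosen lower quantile drops below a threshold. This is a minimal sketch, not the method evaluated in the paper: the function name `first_predicted_violation`, the `ViolationAlert` type, the quantile levels, the threshold, and the convention that a smaller metric value means less safe are all assumptions made for illustration, and the forecast values are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ViolationAlert:
    step: int        # first forecast step (0-based) at which the alert fires
    quantile: float  # quantile level whose path crossed the threshold
    value: float     # forecast safety metric value at that step


def first_predicted_violation(
    quantile_forecast: Dict[float, List[float]],
    threshold: float,
    alert_quantile: float = 0.1,
) -> Optional[ViolationAlert]:
    """Return the first horizon step where the chosen quantile of the
    forecast safety metric falls below the threshold, or None.

    Assumption: lower metric values are less safe, so a violation is
    predicted when the metric drops below `threshold`.
    """
    path = quantile_forecast[alert_quantile]
    for step, value in enumerate(path):
        if value < threshold:
            return ViolationAlert(step, alert_quantile, value)
    return None


# Hypothetical quantile forecast over a 4-step horizon (illustrative values).
forecast = {
    0.1: [4.0, 2.5, 0.8, -0.3],
    0.5: [5.0, 4.0, 3.0, 2.0],
    0.9: [6.0, 5.5, 5.0, 4.5],
}

conservative = first_predicted_violation(forecast, threshold=1.0)
median_based = first_predicted_violation(forecast, threshold=1.0, alert_quantile=0.5)
```

Alerting on a lower quantile rather than the median makes the monitor more conservative: in this made-up forecast the 0.1 quantile crosses the threshold within the horizon while the median does not, so the choice of quantile trades early warning against false alarms.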