Safety is essential when learning control policies for physical systems, since violating safety constraints during training can cause costly hardware damage. In response, the field of safe learning has produced algorithms that provide probabilistic safety guarantees without knowledge of the underlying system dynamics. These algorithms often rely on Gaussian process inference, which scales cubically with the number of data points and thus limits applicability to high-dimensional and embedded systems. In this paper, we propose a safe learning algorithm that provides probabilistic safety guarantees but uses the Nadaraya-Watson estimator instead of Gaussian processes. With the Nadaraya-Watson estimator, we can achieve logarithmic scaling with the number of data points. We provide theoretical guarantees for the estimates, embed them into a safe learning algorithm, and present numerical experiments on a simulated seven-degrees-of-freedom robot manipulator.
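For readers unfamiliar with the Nadaraya-Watson estimator, the following is a minimal sketch of its standard textbook form: a kernel-weighted average of observed outputs. It is not the algorithm proposed in the paper. The Gaussian kernel, the scalar inputs, the default bandwidth, and the names `gaussian_kernel` and `nadaraya_watson` are assumptions chosen for illustration. This naive version evaluates every data point, so its cost per query is linear in the number of data points; it does not reproduce the logarithmic scaling stated above.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel (assumed choice); the normalizing
    constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x_query, xs, ys, bandwidth=0.5):
    """Estimate f(x_query) as a kernel-weighted average of the observed ys.

    xs, ys    : observed scalar inputs and outputs (same length)
    bandwidth : kernel width (hypothetical default, must be positive)
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: no data near the query point.
        raise ValueError("no data close enough to the query point")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Toy data for illustration only: a constant function.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 1.0, 1.0, 1.0]
estimate = nadaraya_watson(1.5, xs, ys)
```

Because the estimate is a weighted average with non-negative weights, it always lies between the smallest and largest observed output, and no matrix inversion is required, in contrast to Gaussian process inference.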