Uncertainty quantification is essential when deploying learning-based control methods in safety-critical systems. This is commonly realized by constructing uncertainty tubes that enclose the unknown function of interest, e.g., the reward and constraint functions or the underlying dynamics model, with high probability. However, existing approaches for uncertainty quantification typically rely on restrictive assumptions about the unknown function, such as known bounds on functional norms or Lipschitz constants, and struggle with discontinuities. In this paper, we model the unknown function as a random function from which independent and identically distributed realizations can be generated, and use the scenario approach to construct uncertainty tubes that hold with high probability and rely solely on the sampled realizations. We integrate these uncertainty tubes into a safe Bayesian optimization algorithm, which we then use to safely tune control parameters on a real Furuta pendulum.
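
To make the sampling-based idea concrete, the following is a minimal sketch of a scenario-style tube and should not be read as the construction used in the paper. It rests on several assumptions of ours: the random function is a hypothetical toy model (`sample_realizations`, a sine with random amplitude, phase, and a jump at a random location), the tube has constant radius around a fixed center chosen independently of the samples, and containment is only checked on a finite grid. Under these assumptions the radius is the single decision variable of a convex scenario program, so the standard bound applies: with N i.i.d. realizations, the probability that the resulting tube is violated by a fresh realization with probability more than epsilon is at most (1 - epsilon)^N.

```python
import math

import numpy as np


def sample_realizations(n_samples, grid, rng):
    """Hypothetical random function (an assumption for illustration only):
    a sine with random amplitude and phase plus a jump at a random location."""
    amp = rng.uniform(0.5, 1.5, size=(n_samples, 1))
    phase = rng.uniform(-0.3, 0.3, size=(n_samples, 1))
    jump_at = rng.uniform(0.4, 0.6, size=(n_samples, 1))
    jump = 0.5 * (grid[None, :] > jump_at)  # discontinuity, shape (n_samples, len(grid))
    return amp * np.sin(2.0 * np.pi * grid[None, :] + phase) + jump


def scenario_tube(realizations, center):
    """Smallest constant-radius tube around `center` that contains every
    sampled realization on the grid. `center` must not depend on the samples."""
    radius = np.max(np.abs(realizations - center[None, :]))
    return center - radius, center + radius


def samples_needed(epsilon, beta):
    """Smallest N with (1 - epsilon)**N <= beta, i.e. the sample size for
    violation level epsilon and confidence 1 - beta with one decision variable."""
    return math.ceil(math.log(beta) / math.log(1.0 - epsilon))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = np.linspace(0.0, 1.0, 200)
    epsilon, beta = 0.05, 1e-3

    n = samples_needed(epsilon, beta)
    lower, upper = scenario_tube(sample_realizations(n, grid, rng), np.zeros_like(grid))

    # Empirical check on fresh realizations: fraction that leaves the tube somewhere.
    fresh = sample_realizations(10_000, grid, rng)
    violated = np.any((fresh < lower) | (fresh > upper), axis=1)
    print(f"N = {n}, empirical violation rate = {violated.mean():.4f}")
```

Note that the tube uses no norm bound or Lipschitz constant, and the jump in the toy model causes no difficulty, because only the sampled values enter the computation. A tube whose width varies along the grid would need more decision variables and therefore more samples for the same guarantee.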