Solving chance-constrained stochastic optimal control problems is a significant challenge in control because analytical solutions exist only for a handful of special cases. A common and computationally efficient approach to such problems is to reformulate the chance constraints as hard constraints with a constraint-tightening parameter. However, choosing the constraint-tightening parameter remains challenging, and guarantees can mostly be obtained only under the assumption that the process noise distribution is known a priori. Moreover, the chance constraints are often not tightly satisfied, leading to unnecessarily high costs. This work proposes a data-driven approach for learning the constraint-tightening parameters online during control. To this end, we reformulate the choice of constraint-tightening parameter for the closed-loop system as a binary regression problem. We then leverage a highly expressive Gaussian process (GP) model for binary regression to approximate the smallest constraint-tightening parameters that satisfy the chance constraints. We show that, when the algorithm parameters are tuned appropriately, the resulting constraint-tightening parameters satisfy the chance constraints up to an arbitrarily small margin with high probability. In numerical experiments, our approach yields constraint-tightening parameters that tightly satisfy the chance constraints, resulting in a lower average cost than three other state-of-the-art approaches.
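To make the binary-outcome view of constraint tightening concrete, the following is a minimal sketch on a toy scalar system, not the method proposed in this work. Each closed-loop episode yields a binary label (constraint satisfied or violated) for a given tightening parameter, and the sketch searches a grid for the smallest parameter whose empirical satisfaction rate reaches 1 - delta. It uses plain empirical frequencies in place of the GP binary regression model, and the dynamics, controller, noise level, horizon, joint (whole-episode) chance constraint, and all function names are illustrative assumptions.

```python
import random


def episode_satisfied(theta, rng, steps=20, x_max=1.0, noise_std=0.1):
    """Return 1 if x <= x_max holds at every step of one closed-loop run, else 0."""
    x = 0.0
    for _ in range(steps):
        # Toy controller (assumption): steer the nominal next state onto the
        # tightened bound x_max - theta, so the hard constraint is active.
        u = (x_max - theta) - 0.5 * x
        # Toy dynamics (assumption): x+ = 0.5 * x + u + w, Gaussian noise w.
        x = 0.5 * x + u + rng.gauss(0.0, noise_std)
        if x > x_max:
            return 0  # binary label: constraint violated in this episode
    return 1  # binary label: constraint satisfied in this episode


def satisfaction_rate(theta, episodes=2000, seed=0):
    """Empirical probability that a full episode satisfies the constraint."""
    rng = random.Random(seed)
    hits = sum(episode_satisfied(theta, rng) for _ in range(episodes))
    return hits / episodes


def smallest_tightening(thetas, delta=0.1, episodes=2000, seed=0):
    """Smallest grid value whose empirical satisfaction rate is >= 1 - delta."""
    for theta in sorted(thetas):
        if satisfaction_rate(theta, episodes, seed) >= 1.0 - delta:
            return theta
    return None  # no grid value met the chance constraint


if __name__ == "__main__":
    grid = [0.05 * i for i in range(11)]  # candidate tightening parameters
    print(smallest_tightening(grid, delta=0.1))
```

A grid search of this kind needs many episodes per candidate value, which illustrates why a regression model over the binary outcomes is attractive: it can share information across nearby tightening parameters instead of estimating each one separately.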