This paper considers a stochastic control framework in which the residual model uncertainty of the dynamical system is learned using a Gaussian Process (GP). In the proposed formulation, the residual model uncertainty consists of a nonlinear function and state-dependent noise. The formulation uses a posterior-GP to approximate the residual model uncertainty and a prior-GP to account for the state-dependent noise. The two GPs are interdependent and are therefore learned jointly using an iterative algorithm. Theoretical properties of the iterative algorithm are established. Advantages of the proposed state-dependent formulation include (i) faster convergence of the GP estimate to the unknown function, because the GP learns which data samples are more trustworthy, and (ii) an accurate estimate of the state-dependent noise, which a controller or decision-maker can use, for example, to determine the uncertainty of an action. Simulation studies highlight these two advantages.
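To make the idea of two interdependent GPs learned by alternation more concrete, the following Python sketch fits one GP to an unknown function and a second GP to the log of a state-dependent noise variance, updating each in turn. It is not the paper's algorithm: it is a generic heteroscedastic-regression scheme chosen for illustration, and the squared-exponential kernel, the fixed hyperparameters, the function names (`rbf`, `gp_mean`, `fit_heteroscedastic`, `make_demo_data`), and the synthetic data are all assumptions.

```python
import numpy as np

# For Gaussian noise eps ~ N(0, s^2): E[log eps^2] = log s^2 - 1.2704
# and Var[log eps^2] = pi^2 / 2. Both are used to de-bias the noise fit.
LOG_CHI2_BIAS = 1.2704
LOG_CHI2_VAR = np.pi ** 2 / 2


def rbf(a, b, length, var):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)


def gp_mean(x_train, y_train, x_test, noise_var, length=1.0, var=1.0):
    """GP posterior mean with per-sample noise variances on the diagonal."""
    K = rbf(x_train, x_train, length, var) + np.diag(noise_var)
    Ks = rbf(x_test, x_train, length, var)
    return Ks @ np.linalg.solve(K, y_train)


def fit_heteroscedastic(x, y, n_iter=5):
    """Alternate between a function GP and a log-noise-variance GP.

    Illustrative assumption: fixed hyperparameters, no marginal-likelihood
    optimisation, and no convergence check.
    """
    noise_var = np.full_like(y, np.var(y))  # start from constant noise
    for _ in range(n_iter):
        # Step 1: fit the function, trusting low-noise samples more.
        f_hat = gp_mean(x, y, x, noise_var)
        # Step 2: treat de-biased log squared residuals as noisy
        # observations of the log noise variance and smooth them.
        z = np.log((y - f_hat) ** 2 + 1e-12) + LOG_CHI2_BIAS
        z_mean = z.mean()
        g_hat = z_mean + gp_mean(
            x, z - z_mean, x, np.full_like(z, LOG_CHI2_VAR), length=1.0, var=4.0
        )
        # Floor keeps the kernel matrix well conditioned.
        noise_var = np.maximum(np.exp(g_hat), 1e-6)
    # Final function estimate using the last noise estimate.
    f_hat = gp_mean(x, y, x, noise_var)
    return f_hat, noise_var


def make_demo_data(n=200, seed=0):
    """Synthetic 1-D data whose noise standard deviation grows with x."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 4.0, n)
    true_std = 0.05 + 0.25 * x
    y = np.sin(x) + true_std * rng.standard_normal(n)
    return x, y, true_std


if __name__ == "__main__":
    x, y, true_std = make_demo_data()
    f_hat, noise_var = fit_heteroscedastic(x, y)
    print("estimated noise std at ends:", np.sqrt(noise_var[[0, -1]]))
    print("true noise std at ends:     ", true_std[[0, -1]])
```

In this sketch the per-sample noise variances enter the function GP through the diagonal of the kernel matrix, so samples from low-noise regions pull the estimate harder than samples from high-noise regions. The resulting `noise_var` array is the kind of state-dependent uncertainty estimate that a downstream controller could consult.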