This paper considers the problem of evaluating an autonomous system's competency in performing a task, particularly in dynamic and uncertain environments. Machine learning models are often opaque to the user, a property commonly described as a "black box", which makes such evaluation challenging. To overcome this, we propose using a measure called the surprise index, which leverages available measurement data to quantify whether the dynamic system performs as expected. We show that the surprise index can be computed in closed form for dynamic systems, given observed evidence in a probabilistic model, if the joint distribution of that evidence is a multivariate Gaussian marginal distribution. We then apply it to a nonlinear spacecraft maneuver problem in which actions are chosen by a reinforcement learning agent, and show that it can indicate how well the trajectory follows the required orbit.
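To make the closed-form claim concrete, the following is a minimal sketch and not the paper's implementation. It assumes the surprise index of observed evidence is defined as the total probability of outcomes that are no more likely than the one observed; the abstract does not state the definition, so this is an assumption. Under a k-dimensional Gaussian, the density decreases monotonically with the Mahalanobis distance from the mean, and the squared distance follows a chi-square distribution with k degrees of freedom, so under the assumed definition the index reduces to a chi-square tail probability. The function names `cholesky`, `mahalanobis_sq`, `chi2_sf`, and `surprise_index` are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    L = cholesky(cov)
    z = []
    # Forward substitution solves L z = (x - mean); then d^2 = z . z
    for i in range(len(x)):
        r = x[i] - mean[i] - sum(L[i][k] * z[k] for k in range(i))
        z.append(r / L[i][i])
    return sum(v * v for v in z)


def chi2_sf(d2, k):
    """Tail probability P(chi2_k >= d2), using the finite series for integer k."""
    h = d2 / 2.0
    if k % 2 == 0:
        # Even k: exp(-h) * sum_{j=0}^{k/2-1} h^j / j!
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    # Odd k: erfc(sqrt(h)) + exp(-h) * sum_{j=1}^{(k-1)/2} h^(j-1/2) / Gamma(j+1/2)
    total = math.erfc(math.sqrt(h))
    for j in range(1, (k - 1) // 2 + 1):
        total += math.exp(-h) * h ** (j - 0.5) / math.gamma(j + 0.5)
    return total


def surprise_index(x, mean, cov):
    """Assumed definition: probability of outcomes no more likely than x.

    Values near 1 mean the evidence is typical under the model;
    values near 0 mean it is surprising.
    """
    return chi2_sf(mahalanobis_sq(x, mean, cov), len(x))


if __name__ == "__main__":
    mean = [0.0, 0.0]
    cov = [[2.0, 1.0], [1.0, 2.0]]
    print(surprise_index([0.0, 0.0], mean, cov))  # evidence at the mean
    print(surprise_index([4.0, -4.0], mean, cov))  # evidence far from the mean
```

In this sketch a measurement at the predicted mean gives an index of 1, and the index falls toward 0 as the measurement moves away from the mean relative to the predicted covariance. For a dynamic system, `mean` and `cov` would come from the model's predicted distribution over the measurements, which is outside the scope of this illustration.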