Deep policy networks enable robots to learn behaviors that solve complex real-world tasks in an end-to-end fashion. However, they lack the transparency needed to explain the reasons for their actions. Such black-box models therefore often suffer from low reliability and can produce disruptive actions when the robot is deployed in practice. To improve transparency, it is important to explain robot behaviors by considering the extent to which each input feature contributes to determining a given action. In this paper, we present an explicit analysis of deep policy models through input attribution methods to explain how and to what extent each input feature affects the decisions of robot policy models. To this end, we present two methods for applying input attribution methods to robot policy networks: (1) we measure the importance factor of each joint torque to reflect the influence of the motor torque on the end-effector movement, and (2) we modify a relevance propagation method to properly handle negative inputs and outputs in deep policy networks. To the best of our knowledge, this is the first report to identify, online, the dynamic changes in the input attributions of multi-modal sensor inputs in deep policy networks for robotic manipulation.
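As a rough illustration of what input attribution computes for a policy network, the sketch below applies the standard LRP-epsilon rule to a toy bias-free ReLU network. It is not the modified relevance propagation rule proposed in this paper, whose details are not given in the abstract. The function name `lrp_epsilon`, the weights, and the observation are hypothetical and chosen only for illustration. The observation deliberately contains a negative entry to show that relevance scores remain well defined for signed inputs.

```python
import numpy as np

def lrp_epsilon(weights, x, action_index, eps=1e-9):
    """Attribute one action output to the input features with the LRP-epsilon rule.

    weights: list of (out_dim, in_dim) matrices of a bias-free ReLU MLP
    (a hypothetical toy policy); the last layer is linear.
    """
    # Forward pass, storing the input activations of every layer.
    activations = [np.asarray(x, dtype=float)]
    for i, W in enumerate(weights):
        z = W @ activations[-1]
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))

    # Start with the selected action's output as the total relevance.
    relevance = np.zeros_like(activations[-1])
    relevance[action_index] = activations[-1][action_index]

    # Backward pass: redistribute relevance in proportion to a_j * w_kj.
    for W, a in zip(reversed(weights), reversed(activations[:-1])):
        z = W @ a
        # Signed stabiliser keeps the denominator away from zero for negative z too.
        z = z + eps * np.where(z >= 0, 1.0, -1.0)
        s = relevance / z
        relevance = a * (W.T @ s)
    return relevance

# Hypothetical toy policy: 3 sensor inputs -> 2 hidden units -> 2 action outputs.
policy_weights = [
    np.array([[1.0, -2.0, 0.5], [-1.0, 0.5, 2.0]]),
    np.array([[1.5, -1.0], [0.5, 2.0]]),
]
observation = np.array([0.5, -1.0, 0.25])  # includes a negative input

rel = lrp_epsilon(policy_weights, observation, action_index=0)
print(rel)        # per-input relevance for action 0
print(rel.sum())  # approximately equals the action-0 output (conservation)
```

In this toy case the relevance scores sum to the selected action output, which is the conservation property that makes the per-feature scores interpretable as shares of the decision. Recomputing them at every control step would give the kind of online attribution trace the abstract refers to.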