Advances in simulation and formal methods-guided environment sampling have enabled the rigorous evaluation of machine learning models in a number of safety-critical scenarios, such as autonomous driving. However, the use of these environment sampling techniques to improve the learned models themselves has yet to be fully explored. In this work, we introduce a novel method for improving imitation-learned models in a semantically aware fashion, using specification-guided sampling techniques to aggregate expert data in new environments. Specifically, we create a set of formal specifications that partitions the space of possible environments into semantically similar regions, and we identify the elements of this partition where the imitation-learned model behaves most differently from the expert. We then aggregate expert data on environments in these identified regions, leading to more accurate imitation of the expert's behavior semantics. We instantiate our approach in a series of experiments in the CARLA driving simulator, and demonstrate that it leads to models that are more accurate than those learned with other environment sampling methods.
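The loop described above (partition environments by specifications, rank regions by learner-expert disagreement, aggregate expert data in the worst regions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the dictionary-based environment encoding, the toy `expert` and `learner` policies, and the use of a finite candidate pool in place of a specification-guided sampler are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def partition_by_specs(envs, specs):
    """Group environments by the tuple of specifications they satisfy."""
    regions = defaultdict(list)
    for env in envs:
        key = tuple(bool(spec(env)) for spec in specs)
        regions[key].append(env)
    return dict(regions)


def region_disagreement(envs, learner, expert):
    """Mean absolute action difference between learner and expert in a region."""
    return sum(abs(learner(e) - expert(e)) for e in envs) / len(envs)


def aggregate_expert_data(dataset, envs, specs, learner, expert,
                          top_k=1, per_region=2, rng=None):
    """Append expert-labelled environments from the top_k worst regions."""
    rng = rng or random.Random(0)
    regions = partition_by_specs(envs, specs)
    ranked = sorted(
        regions,
        key=lambda k: region_disagreement(regions[k], learner, expert),
        reverse=True,
    )
    chosen = ranked[:top_k]
    for key in chosen:
        pool = regions[key]
        for env in rng.sample(pool, min(per_region, len(pool))):
            dataset.append((env, expert(env)))
    return chosen


# Hypothetical toy setup: an environment is a dict of scenario parameters.
specs = [lambda e: e["rain"], lambda e: e["speed"] > 20]


def expert(env):
    # Assumed expert: brakes harder in rain, hardest in rain at high speed.
    if env["rain"] and env["speed"] > 20:
        return 0.8
    return 0.3 if env["rain"] else 0.1


def learner(env):
    # Assumed imitation model that has only seen dry-weather data.
    return 0.1


envs = [{"speed": s, "rain": r}
        for s in (10, 15, 25, 30, 35) for r in (False, True)]
dataset = []
chosen = aggregate_expert_data(dataset, envs, specs, learner, expert)
```

In this toy setup the region where both specifications hold (rain and high speed) shows the largest disagreement, so the new expert demonstrations are drawn from it. A real pipeline would retrain the imitation model on the aggregated dataset and repeat the loop, and would sample fresh environments from each region rather than reuse a fixed pool.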