Motion forecasting has become an increasingly critical component of autonomous robotic systems, but onboard compute budgets typically limit the accuracy of real-time systems. In this work we propose methods for improving motion forecasting systems under limited compute budgets by combining model ensembling and distillation techniques. Ensembles of deep neural networks have been shown to improve generalization accuracy in many application domains. We first demonstrate significant performance gains by creating a large ensemble of optimized single models. We then develop a generalized framework for distilling motion forecasting model ensembles into small student models that retain high performance at a fraction of the compute cost. For this study we focus on motion forecasting using real-world data from autonomous driving systems. We develop ensemble models that are highly competitive on the Waymo Open Motion Dataset (WOMD) and Argoverse leaderboards, and from these ensembles we train distilled student models that retain high performance at a fraction of the compute cost. These experiments demonstrate that distillation from ensembles is an effective method for improving the accuracy of predictive models for robotic systems with limited compute budgets.
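As a rough illustration of distilling an ensemble into a student, the sketch below averages the trajectories predicted by several ensemble members and scores a student prediction against both that average and the ground truth. This is only a minimal sketch under simplifying assumptions, not the framework developed in this work: it treats each prediction as a single trajectory of (x, y) waypoints, whereas practical motion forecasting models usually output several weighted trajectory modes. The names `ensemble_mean`, `mean_sq_displacement`, `distillation_loss`, and the blending weight `alpha` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# One (x, y) waypoint per future timestep (assumed unimodal for simplicity).
Trajectory = List[Tuple[float, float]]


def ensemble_mean(predictions: List[Trajectory]) -> Trajectory:
    """Average the waypoints predicted by each ensemble member."""
    n = len(predictions)
    horizon = len(predictions[0])
    return [
        (
            sum(p[t][0] for p in predictions) / n,
            sum(p[t][1] for p in predictions) / n,
        )
        for t in range(horizon)
    ]


def mean_sq_displacement(a: Trajectory, b: Trajectory) -> float:
    """Mean squared Euclidean distance between matching waypoints."""
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(a, b)
    )
    return total / len(a)


def distillation_loss(
    student: Trajectory,
    teacher: Trajectory,
    ground_truth: Trajectory,
    alpha: float = 0.5,  # hypothetical weight on the teacher-matching term
) -> float:
    """Blend a teacher-matching term with a ground-truth term."""
    return (
        alpha * mean_sq_displacement(student, teacher)
        + (1.0 - alpha) * mean_sq_displacement(student, ground_truth)
    )


# Two ensemble members that disagree on the final waypoint.
members = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (1.0, 2.0)],
]
teacher = ensemble_mean(members)
student = [(0.0, 0.0), (1.0, 1.0)]
ground_truth = [(0.0, 0.0), (1.0, 0.0)]
loss = distillation_loss(student, teacher, ground_truth)
```

In this toy case the student matches the ensemble average exactly, so only the ground-truth term contributes to the loss. In a training loop, a quantity of this kind would be minimized over the student's parameters, with the ensemble run only offline to produce targets, so the student alone bears the onboard compute cost.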