We present TreeIRL, a novel planner for autonomous driving that combines Monte Carlo tree search (MCTS) and inverse reinforcement learning (IRL) to achieve state-of-the-art performance in simulation and in real-world driving. The core idea is to use MCTS to find a promising set of safe candidate trajectories and a deep IRL scoring function to select the most human-like among them. We evaluate TreeIRL against both classical and state-of-the-art planners in large-scale simulations and on 500+ miles of real-world autonomous driving in the Las Vegas metropolitan area. Test scenarios include dense urban traffic, adaptive cruise control, cut-ins, and traffic lights. TreeIRL achieves the best overall performance, striking a balance between safety, progress, comfort, and human-likeness. To our knowledge, our work is the first demonstration of MCTS-based planning on public roads and underscores the importance of evaluating planners across a diverse set of metrics and in real-world environments. TreeIRL is highly extensible and could be further improved with reinforcement learning and imitation learning, providing a framework for exploring different combinations of classical and learning-based approaches to solve the planning bottleneck in autonomous driving.
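To make the two-stage idea concrete, here is a minimal Python sketch of the pipeline the abstract describes: a search stage proposes a set of candidate trajectories and a learned scoring function selects the most human-like one. Everything below (the random-rollout stand-in for MCTS, the hand-crafted features, the linear reward) is a hypothetical illustration under our own assumptions, not the authors' actual implementation.

```python
# Minimal sketch of the MCTS + IRL selection idea from the abstract.
# All names and numbers here are illustrative placeholders.

from dataclasses import dataclass
import numpy as np


@dataclass
class Trajectory:
    """A candidate ego trajectory: an array of (x, y, v) states over the horizon."""
    states: np.ndarray  # shape (T, 3)


def mcts_candidates(rng: np.random.Generator, n: int = 8, horizon: int = 20) -> list[Trajectory]:
    """Placeholder for the MCTS stage: return a set of safe candidate trajectories.
    Here we simply sample smooth random rollouts to keep the sketch runnable."""
    candidates = []
    for _ in range(n):
        accel = rng.normal(0.0, 0.5, size=horizon).cumsum()
        v = np.clip(10.0 + accel, 0.0, 15.0)            # speed profile (m/s)
        x = np.cumsum(v) * 0.1                           # forward progress at 10 Hz
        y = rng.normal(0.0, 0.2, size=horizon).cumsum()  # small lateral drift
        candidates.append(Trajectory(states=np.stack([x, y, v], axis=1)))
    return candidates


def trajectory_features(traj: Trajectory) -> np.ndarray:
    """Hand-crafted features that a deep IRL model would replace with learned ones:
    progress, comfort (a jerk proxy), and lane-keeping error."""
    x, y, v = traj.states[:, 0], traj.states[:, 1], traj.states[:, 2]
    progress = x[-1] - x[0]
    comfort = -np.abs(np.diff(v, n=2)).mean()
    lane_keeping = -np.abs(y).mean()
    return np.array([progress, comfort, lane_keeping])


def irl_score(traj: Trajectory, weights: np.ndarray) -> float:
    """Stand-in for the learned IRL scoring function: a linear reward over features.
    In TreeIRL this role is played by a deep network trained on human driving."""
    return float(weights @ trajectory_features(traj))


def plan_one_step(rng: np.random.Generator, weights: np.ndarray) -> Trajectory:
    """TreeIRL-style selection: the search proposes safe candidates,
    the IRL scorer picks the most human-like among them."""
    candidates = mcts_candidates(rng)
    return max(candidates, key=lambda t: irl_score(t, weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = np.array([1.0, 5.0, 2.0])  # hypothetical reward weights
    best = plan_one_step(rng, weights)
    print("selected trajectory final progress:", round(best.states[-1, 0], 2), "m")
```

The split mirrors the extensibility point in the abstract: the candidate generator and the scorer are independent components, so either the search (e.g., via reinforcement learning) or the scoring function (e.g., via imitation learning) could be swapped out without changing the other.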