The aspiration of next-generation autonomous driving (AD) technology relies on the dedicated integration and interaction among intelligent perception, prediction, planning, and low-level control. The upper bound of AD algorithm performance has hit a major bottleneck, and there is a consensus in academia and industry that the key to surmounting it lies in data-centric autonomous driving technology. Recent advances in AD simulation, closed-loop model training, and AD big data engines have yielded valuable experience. However, there is a lack of systematic knowledge and deep understanding of how to build efficient data-centric AD technology for AD algorithm self-evolution and better AD big data accumulation. To fill these research gaps, this article reviews state-of-the-art data-driven autonomous driving technologies, with an emphasis on a comprehensive taxonomy of autonomous driving datasets characterized by milestone generations, key features, and data acquisition settings, among others. Furthermore, we provide a systematic review of existing benchmark closed-loop AD big data pipelines from the industrial frontier, including the procedures of closed-loop frameworks, key technologies, and empirical studies. Finally, future directions, potential applications, limitations, and concerns are discussed to encourage efforts from both academia and industry to promote the further development of autonomous driving.