Simulation frameworks have been key enablers for the development and validation of autonomous driving systems. However, existing methods struggle to comprehensively address the autonomy-oriented requirements of balancing (i) dynamical fidelity, (ii) photorealistic rendering, (iii) context-relevant scenario orchestration, and (iv) real-time performance. To address these limitations, we present a unified framework for creating and curating high-fidelity digital twins to accelerate advancements in autonomous driving research. Our framework leverages a mix of physics-based and data-driven techniques to develop and simulate digital twins of autonomous vehicles and their operating environments. It is capable of reconstructing real-world scenes and assets with geometric and photorealistic accuracy (~97% structural similarity) and infusing them with physical properties to enable real-time (>60 Hz) dynamical simulation of the ensuing driving scenarios. Additionally, it incorporates a large language model (LLM) interface that flexibly edits driving scenarios online via natural language prompts, achieving ~85% generalizability and ~95% repeatability. Finally, an optional vision language model (VLM) blends the hybrid scene composition, yielding ~80% visual enhancement.