Real-time path planning in constrained environments remains a fundamental challenge for autonomous systems. Classical planners, while effective under perfect-perception assumptions, are often sensitive to real-world perception constraints and rely on online search procedures that incur high computational costs. In complex surroundings, this makes real-time deployment impractical. To overcome these limitations, we introduce a Deep Reinforcement Learning (DRL) framework for real-time path planning in parking scenarios. In particular, we focus on challenging scenes with tight spaces that require many reversal maneuvers and adjustments. Unlike classical planners, our solution does not require ideal, structured perception and could, in principle, avoid the need for additional modules such as localization and tracking, resulting in a simpler and more practical implementation. Moreover, at test time the policy generates each action with a single forward pass per step, which is lightweight enough for real-time deployment. The task is formulated as a sequential decision-making problem grounded in bicycle-model dynamics, enabling the agent to directly learn navigation policies that respect vehicle kinematics and environmental constraints in a closed-loop setting. We develop a new benchmark that supports both training and evaluation and captures diverse and challenging scenarios. Our approach achieves state-of-the-art success rates and efficiency, surpassing classical planner baselines by +96% in success rate and +52% in efficiency. Furthermore, we release the benchmark as an open-source resource to foster future research in autonomous systems. The benchmark and accompanying tools are available at https://github.com/dqm5rtfg9b-collab/Constrained_Parking_Scenarios.
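To make the formulation concrete, the following is a minimal sketch of a closed-loop rollout over a kinematic bicycle model, in which a policy is queried once per step. It is an illustration under stated assumptions, not the paper's implementation: the names `State`, `bicycle_step`, and `rollout`, the rear-axle reference point, the Euler integration, and the default `wheelbase` and `dt` values are all hypothetical choices that the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical vehicle state, referenced at the rear axle."""
    x: float    # position along x [m]
    y: float    # position along y [m]
    yaw: float  # heading angle [rad]


def bicycle_step(state, speed, steer, wheelbase=2.7, dt=0.1):
    """Advance a kinematic bicycle model by one explicit Euler step.

    speed is signed (negative means reversing) and steer is the front-wheel
    angle in radians. wheelbase and dt are assumed values for illustration.
    """
    x = state.x + speed * math.cos(state.yaw) * dt
    y = state.y + speed * math.sin(state.yaw) * dt
    yaw = state.yaw + (speed / wheelbase) * math.tan(steer) * dt
    # Wrap the heading into [-pi, pi) so it stays bounded over long rollouts.
    yaw = (yaw + math.pi) % (2.0 * math.pi) - math.pi
    return State(x, y, yaw)


def rollout(state, policy, steps):
    """Closed-loop rollout: the policy maps the current state to one action."""
    trajectory = [state]
    for _ in range(steps):
        speed, steer = policy(state)  # one policy query per step
        state = bicycle_step(state, speed, steer)
        trajectory.append(state)
    return trajectory
```

In a learned setting, `policy` would stand in for the trained network's forward pass. Here any callable that returns a `(speed, steer)` pair works, for example `rollout(State(0.0, 0.0, 0.0), lambda s: (-1.0, 0.2), 20)` for a reversing arc.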