We introduce CollisionPro, a framework for estimating cumulative collision probability distributions with temporal difference learning. It targets applications in robotics, with a particular emphasis on autonomous driving. The approach addresses the demand for explainable artificial intelligence (XAI) and seeks to overcome the limitations imposed by model-based approaches and conservative constraints. We formulate the framework within reinforcement learning to pave the way for safety-aware agents; however, it could also prove useful in other contexts, for example as a safety alert system or for analytical purposes. We evaluate the framework in a realistic autonomous driving simulator, where it shows high sample efficiency and reliable prediction of previously unseen collision events. The source code is publicly available.
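To make the core idea concrete, the following is a minimal tabular sketch of estimating a cumulative collision probability with temporal difference learning. It is not the CollisionPro implementation: the four-state chain, the names `step` and `learn_collision_probabilities`, and all constants are hypothetical and chosen only for illustration. The sketch assumes the recursion P_h(s) = E[c + (1 - c) P_{h-1}(s')], where c is 1 if the transition ends in a collision and P_0 is 0 everywhere.

```python
import random

# Hypothetical toy setup (not from the paper): states 0..3 on a line,
# state 3 is the collision state. Each step the agent advances by one
# state with probability P_ADVANCE and otherwise stays where it is.
N_STATES = 4
COLLISION = N_STATES - 1
HORIZON = 3
P_ADVANCE = 0.5


def step(state, rng):
    """Sample one transition of the toy chain."""
    return state + 1 if rng.random() < P_ADVANCE else state


def learn_collision_probabilities(n_samples=60000, seed=0):
    """Estimate prob[h][s] = P(collision within h steps | state s) by TD updates."""
    rng = random.Random(seed)
    # prob[0] stays all zeros: no collision can happen within zero steps.
    prob = [[0.0] * N_STATES for _ in range(HORIZON + 1)]
    visits = [0] * N_STATES
    for _ in range(n_samples):
        s = rng.randrange(COLLISION)          # sample a non-collision state
        s_next = step(s, rng)
        collided = 1.0 if s_next == COLLISION else 0.0
        visits[s] += 1
        alpha = 1.0 / visits[s]               # decaying step size
        for h in range(1, HORIZON + 1):
            # Bootstrapped target: collide now, or survive and collide
            # within the remaining h - 1 steps from the next state.
            target = collided + (1.0 - collided) * prob[h - 1][s_next]
            prob[h][s] += alpha * (target - prob[h][s])
    return prob


if __name__ == "__main__":
    estimates = learn_collision_probabilities()
    for h in range(1, HORIZON + 1):
        print(h, [round(p, 3) for p in estimates[h][:COLLISION]])
```

For this toy chain the exact values are easy to derive by hand (for instance, the probability of colliding within h steps from state 2 is 1 - 0.5^h), so the learned table can be checked against them. A practical system would replace the table with a function approximator and the toy chain with simulator rollouts.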