Integrating machine learning (ML) into self-driving (SD) technology is an impressive engineering feat in learning-driven artificial intelligence. Outside controlled laboratory settings, however, self-driving technology is life-critical, so researchers must attend to safety as well as efficiency. For example, when a self-driving model encounters an unfamiliar environment during real-time execution, improving its expected performance is not enough; its execution and real-time adaptation must also maintain a required level of safety. This study introduces an online meta-reinforcement learning algorithm, \emph{Neurosymbolic Meta-Reinforcement Lookahead Learning} (NUMERLA), that employs lookahead symbolic constraints. NUMERLA proposes a lookahead updating mechanism that balances the efficiency of online adaptation with the goal of long-term safety. Experimental results demonstrate that NUMERLA gives the self-driving agent the capacity for real-time adaptation, leading to safe and self-adaptive driving in non-stationary urban human-vehicle interaction scenarios.