A novel approach of a deep reinforcement learning based motion cueing algorithm for vehicle driving simulation

In motion simulation, the level of immersion depends strongly on the motion cueing algorithm (MCA), which maps the reference motion of the simulated vehicle to a motion of the motion simulation platform (MSP). To provide a realistic virtual driving experience, the MCA must reproduce the motion perception of a real vehicle driver as accurately as possible without exceeding the workspace limits of the MSP. A large discrepancy between the perceived motion signals and the optical cues may cause motion sickness, with the typical symptoms of nausea, dizziness, headache and fatigue. Existing approaches either produce non-optimal results, e.g., due to filtering, linearization, or simplifications, or require computation times that exceed the real-time requirements of a closed-loop application. This work presents a new solution in which the principles of the MCA are not specified by a human designer; instead, an artificial intelligence (AI) learns the optimal motion by trial and error in interaction with the MSP. To achieve this, deep reinforcement learning (RL) is applied, in which an agent interacts with an environment formulated as a Markov decision process (MDP). This allows the agent to directly control a simulated MSP and to obtain feedback on its performance in terms of platform workspace usage and the motion acting on the simulator user. The RL algorithm used is proximal policy optimization (PPO), in which both the value function and the policy corresponding to the control strategy are learned and represented by artificial neural networks (ANNs). The approach is implemented in Python, and its functionality is demonstrated on the practical example of pre-recorded lateral maneuvers. The subsequent validation on a standardized double lane change shows that the RL algorithm is able to learn the control strategy and improve the quality of...
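To make the MDP formulation described above more concrete, the following is a minimal sketch of how a simulated platform could be exposed to an RL agent. It is not the authors' implementation: the single lateral degree of freedom, the class name `LateralPlatformEnv`, the state layout, the reward weights, and the workspace limit are all assumptions chosen for illustration. The reward combines the two feedback terms named in the abstract, the mismatch between reference and platform acceleration and the workspace usage.

```python
from dataclasses import dataclass


@dataclass
class LateralPlatformEnv:
    """Toy 1-DOF motion platform as an MDP (hypothetical, not the paper's model)."""
    dt: float = 0.01          # step size in s (assumed)
    pos_limit: float = 0.5    # workspace half-width in m (assumed)
    w_acc: float = 1.0        # weight on acceleration mismatch (assumed)
    w_pos: float = 0.1        # weight on workspace usage (assumed)

    def reset(self, reference):
        """Start an episode on a pre-recorded reference acceleration trace."""
        self.reference = list(reference)
        self.k = 0
        self.pos = 0.0
        self.vel = 0.0
        return self._state()

    def _state(self):
        # State: platform position, platform velocity, current reference acceleration.
        ref = self.reference[min(self.k, len(self.reference) - 1)]
        return (self.pos, self.vel, ref)

    def step(self, action):
        """Apply a commanded platform acceleration (the action) for one time step."""
        ref = self.reference[self.k]
        # Semi-implicit Euler integration of the platform kinematics.
        self.vel += action * self.dt
        self.pos += self.vel * self.dt
        # Reward: penalize acceleration mismatch and normalized workspace usage.
        reward = (-self.w_acc * (ref - action) ** 2
                  - self.w_pos * (self.pos / self.pos_limit) ** 2)
        self.k += 1
        out_of_bounds = abs(self.pos) > self.pos_limit
        done = out_of_bounds or self.k >= len(self.reference)
        return self._state(), reward, done


# Roll out a placeholder policy that scales the reference acceleration by 0.5.
# A trained PPO policy network would take the place of this hand-written rule.
env = LateralPlatformEnv()
state = env.reset([1.0, 1.0, -1.0, -1.0])
done = False
episode_return = 0.0
while not done:
    action = 0.5 * state[2]
    state, reward, done = env.step(action)
    episode_return += reward
```

In this sketch the episode ends either when the reference trace is exhausted or when the platform leaves its workspace, so a learning agent is pushed to trade acceleration fidelity against platform excursion.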
