We present ReinDriveGen, a framework that gives users full control over dynamic driving scenes: actor trajectories can be freely edited to simulate safety-critical corner cases such as front-vehicle collisions, drifting cars, vehicles spinning out of control, jaywalking pedestrians, and cyclists cutting across lanes. Our approach constructs a dynamic 3D point cloud scene from multi-frame LiDAR data, introduces a vehicle completion module that reconstructs full 360° geometry from partial observations, and renders the edited scene into 2D condition images that guide a video diffusion model to synthesize realistic driving videos. Because such edited scenarios inevitably fall outside the training distribution, we further propose an RL-based post-training strategy with a pairwise preference model and a pairwise reward mechanism, which improves generation quality under out-of-distribution conditions without ground-truth supervision. Extensive experiments demonstrate that ReinDriveGen outperforms existing approaches on edited driving scenarios and achieves state-of-the-art results on novel ego-viewpoint synthesis.
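As a rough illustration of the edit-then-render step described above, the sketch below applies a rigid edit to one actor's points and projects the result into a sparse depth image. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `move_actor` and `render_depth`, the pinhole camera model, the axis conventions, and the choice of depth as the condition signal are all hypothetical, since the abstract does not specify what the condition images contain.

```python
import numpy as np


def move_actor(points, yaw, translation):
    """Rigidly edit an actor: rotate its points about the world z axis
    (assumed to point up), then translate them.

    points: (N, 3) array in world coordinates.
    """
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T + np.asarray(translation, dtype=float)


def render_depth(points_world, K, R_cw, t_cw, height, width):
    """Project world points into a sparse depth image with a z-buffer.

    Assumes a pinhole camera with intrinsics K (3x3) and a world-to-camera
    transform x_cam = R_cw @ x_world + t_cw, with the camera z axis pointing
    forward. Pixels that receive no point are set to 0.
    """
    points_cam = points_world @ R_cw.T + t_cw
    pts = points_cam[points_cam[:, 2] > 1e-6]  # drop points behind the camera

    uvw = pts @ K.T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)

    depth = np.full((height, width), np.inf)
    # Keep the nearest point when several land on the same pixel.
    np.minimum.at(depth, (v[inside], u[inside]), pts[inside, 2])
    depth[np.isinf(depth)] = 0.0
    return depth
```

In this sketch, editing a trajectory amounts to calling `move_actor` with a different yaw and translation per frame and re-rendering. An actual system would likely render richer conditions (for example colored points or semantic masks) than a single depth channel.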