Domain-adaptive trajectory imitation is a skill that some predators learn for survival: they map dynamic information from one domain (their own speed and steering direction) to a different domain (the current position of the moving prey). An intelligent agent with this skill could be applied to a variety of tasks, including recognizing abnormal motion in traffic once it has learned to imitate representative trajectories. To this end, we propose DATI, a deep reinforcement learning agent designed for domain-adaptive trajectory imitation using a cycle-consistent generative adversarial method. Our experiments on a variety of synthetic families of reference trajectories show that DATI outperforms baseline methods for imitation learning and optimal control in this setting while keeping the same per-task hyperparameters. We demonstrate its generalization to a real-world scenario by discovering abnormal motion patterns in maritime traffic, opening the door to the use of deep reinforcement learning methods for spatially unconstrained trajectory data mining.