Every day, railways experience disturbances and disruptions, on both the network and the fleet side, that affect the stability of rail traffic. The resulting delays propagate through the network, creating a mismatch between supply and demand for goods and passengers and, in turn, a loss in service quality. In these cases, it is the duty of human traffic controllers, the so-called dispatchers, to minimize the impact on traffic. However, dispatchers inevitably have a limited perception of the knock-on effects of their decisions, particularly of how those decisions affect areas of the network outside their direct control. In recent years, much work in Decision Science has been devoted to developing methods that solve this problem automatically and support dispatchers in this challenging task. This paper investigates Machine Learning-based methods for tackling the problem and proposes two Deep Q-Learning methods, one Decentralized and one Centralized. Numerical results show the superiority of these techniques over classical linear Q-Learning based on matrices. Moreover, the Centralized approach is compared with a MILP formulation, showing interesting results. The experiments are inspired by data provided by a U.S. Class 1 railroad.