The dynamic vehicle dispatching problem consists of deciding which vehicles to assign to requests that arise stochastically over time and space. It emerges in diverse areas, such as the assignment of trucks to loads to be transported, emergency systems, and ride-hailing services. In this paper, we model the problem as a semi-Markov decision process, which allows us to treat time as continuous. In this setting, decision epochs coincide with discrete events separated by random time intervals. We argue that an event-based approach substantially reduces the combinatorial complexity of the decision space and overcomes other limitations of the discrete-time models often proposed in the literature. To test our approach, we develop a new discrete-event simulator and use double deep Q-learning to train our decision agents. Numerical experiments are carried out in realistic scenarios using data from New York City. We compare the policies obtained through our approach with heuristic policies often used in practice. Results show that our policies exhibit better average waiting times, cancellation rates, and total service times, with reductions in average waiting times of up to 50% relative to the other tested heuristic policies.
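To make the combination of a semi-Markov decision process and double Q-learning concrete, the following is a minimal sketch of how a learning target might be computed for a single event-to-event transition. It is an illustration under stated assumptions, not the formulation used in the paper: the function name `smdp_double_q_target`, the exponential discount `exp(-beta * tau)`, the default rate `beta`, and the treatment of the reward as a lump sum collected at the decision epoch are all hypothetical choices.

```python
import math


def smdp_double_q_target(reward, tau, q_online_next, q_target_next,
                         beta=0.1, terminal=False):
    """Double Q-learning target for one transition between decision epochs.

    reward        -- reward collected over the transition (assumed lump sum)
    tau           -- random continuous time elapsed until the next event
    q_online_next -- online network's action values in the next state
    q_target_next -- target network's action values in the next state
    beta          -- continuous-time discount rate (hypothetical value)
    terminal      -- True if the episode ends with this transition
    """
    if terminal or not q_online_next:
        # No future value to bootstrap from.
        return reward

    # Double Q-learning: the online network selects the action ...
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])

    # ... and the target network evaluates it.
    next_value = q_target_next[best_action]

    # The discount depends on the elapsed time rather than on a fixed step.
    discount = math.exp(-beta * tau)
    return reward + discount * next_value
```

In a discrete-time model the discount factor is a constant per step; here it shrinks with the random sojourn time `tau`, so a long gap between events weighs future value less than a short one.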