We propose a method for learning decision-makers' behavior in routing problems using Inverse Optimization (IO). The IO framework falls into the supervised learning category and builds on the premise that the target behavior is an optimizer of an unknown cost function. This cost function is learned from historical data and, in the context of routing problems, can be interpreted as the routing preferences of the decision-makers. The main contribution of this study is an IO methodology with a hypothesis function, loss function, and stochastic first-order algorithm tailored to routing problems. We further test our IO approach in the Amazon Last Mile Routing Research Challenge, where the goal is to learn models that replicate the routing preferences of human drivers, using thousands of real-world routing examples. Our final IO-learned routing model achieves a score that ranks 2nd compared with the 48 models that qualified for the final round of the challenge. Our examples and results showcase the flexibility and real-world potential of the proposed IO methodology for learning from decision-makers' choices in routing problems.
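To make the three ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear hypothesis (the cost of a tour is the sum of learned arc costs `theta[(i, j)]`), a plain suboptimality loss (cost of the expert tour minus cost of the cheapest tour under the current `theta`), and a projected stochastic subgradient step. The names `tour_arcs`, `tour_cost`, `best_tour`, and `learn_costs` are hypothetical, the brute-force solver is only viable for a handful of stops, and the loss and algorithm tailored to routing in the paper may differ.

```python
import itertools
import random
from collections import Counter


def tour_arcs(tour):
    """Directed arcs (i, j) of a closed tour given as a node sequence."""
    return [(tour[k], tour[(k + 1) % len(tour)]) for k in range(len(tour))]


def tour_cost(theta, tour):
    """Hypothesis function (assumed linear): sum of learned arc costs."""
    return sum(theta[arc] for arc in tour_arcs(tour))


def best_tour(theta, stops):
    """Cheapest tour over `stops` starting at stops[0], by brute force."""
    depot, rest = stops[0], tuple(stops[1:])
    candidates = ((depot,) + p for p in itertools.permutations(rest))
    return min(candidates, key=lambda t: tour_cost(theta, t))


def learn_costs(examples, n_nodes, steps=50, lr=0.1, seed=0):
    """Projected stochastic subgradient descent on the suboptimality loss.

    `examples` is a list of (stops, expert_tour) pairs; each expert tour
    is assumed to start at stops[0].
    """
    rng = random.Random(seed)
    # Uninformative start: every arc costs the same.
    theta = {(i, j): 1.0 for i in range(n_nodes) for j in range(n_nodes) if i != j}
    for _ in range(steps):
        stops, expert = rng.choice(examples)  # sample one example per step
        predicted = best_tour(theta, stops)
        # Subgradient of the loss: arcs of the expert tour minus arcs of the
        # currently optimal tour (shared arcs cancel out).
        grad = Counter(tour_arcs(expert))
        grad.subtract(tour_arcs(predicted))
        for arc, g in grad.items():
            # Gradient step, then projection onto nonnegative costs.
            theta[arc] = max(0.0, theta[arc] - lr * g)
    return theta


if __name__ == "__main__":
    stops = (0, 1, 2, 3, 4)
    expert = (0, 2, 4, 1, 3)  # made-up "driver" tour, for illustration only
    theta = learn_costs([(stops, expert)], n_nodes=5)
    print(best_tour(theta, stops))
```

The update lowers the cost of arcs the expert used and raises the cost of arcs the current model prefers, so the expert tour gradually becomes the optimizer of the learned cost function. The sketch leaves out a normalization on `theta` that would exclude the trivial all-zero solution, and a realistic setting would replace `best_tour` with a proper routing solver.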