In large-scale multi-agent systems such as taxi fleets, individual agents (taxi drivers) are self-interested, each maximizing their own profit, and this can introduce inefficiencies in the system. One such inefficiency concerns the "required" availability of taxis at different times of the day. Because a taxi driver can work only a limited number of hours per day (e.g., 8-10 hours in a city like Singapore), the specific working hours need to be optimized to maximize both individual and social welfare. Technically, this corresponds to solving a large-scale multi-stage selfish routing game with transition uncertainty. Existing work on this problem either cannot handle "driver" constraints (e.g., breaks during work hours) or is not scalable. To that end, we provide a novel mechanism that builds on replicator dynamics through ideas from behavior cloning. We demonstrate that our methods yield significantly better policies than the existing approach in terms of individual agent revenue and overall agent availability.