Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes, it is important to have large-scale, efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. Many applications require processing multiple books, whether to calibrate ABMs or to train RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time. Our simulator, JAX-LOB, is built on design choices that aim to best exploit the strengths of JAX without compromising the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages to provide an example of how one may address an optimal execution problem with reinforcement learning, and we share some preliminary results from end-to-end RL training on GPUs.