We tackle the challenge of large-scale network intervention for guiding excitatory point processes, such as infectious disease spread or traffic congestion. Our model-based reinforcement learning approach uses neural ODEs to capture how networked excitatory point processes evolve under time-varying changes in network topology. The approach incorporates Gradient-Descent based Model Predictive Control (GD-MPC), which gives the policy the flexibility to accommodate prior knowledge and constraints. To address the intricacies of planning and overcome the high dimensionality inherent to such decision-making problems, we design an Amortized Network Interventions (ANI) framework that pools optimal policies from history and other contexts while ensuring permutation equivariance. This property enables efficient knowledge transfer and sharing across diverse contexts. Our approach has broad applications, from curbing infectious disease spread to reducing carbon emissions through traffic light optimization, and thus has the potential to address critical societal and environmental challenges.
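To make the GD-MPC idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the method as implemented in this work. A hand-written scalar toy model stands in for the learned neural ODE, central finite differences stand in for automatic differentiation, and every name and constant (`step`, `rollout_cost`, `plan`, `mpc`, `DECAY`, `EXCITE`, `BASE`, `LAM`) is hypothetical. The control `u` in [0, 1] is read as the fraction of network excitation removed by an intervention.

```python
# Toy stand-in for a learned dynamics model (the abstract describes a neural ODE).
# All names and constants below are hypothetical and chosen for illustration.

DT = 0.1       # Euler step size
DECAY = 1.0    # self-decay rate of the event intensity
EXCITE = 1.5   # excitation strength passed through the network
BASE = 0.2     # exogenous base rate
LAM = 0.05     # cost weight on intervention effort


def step(x, u):
    """One Euler step; u in [0, 1] is the fraction of excitation removed."""
    return x + DT * (BASE - DECAY * x + (1.0 - u) * EXCITE * x)


def rollout_cost(x0, controls):
    """Accumulated cost of a control sequence under the toy model."""
    x, cost = x0, 0.0
    for u in controls:
        x = step(x, u)
        cost += x * x + LAM * u * u
    return cost


def grad(x0, controls, eps=1e-5):
    """Central finite-difference gradient of the cost w.r.t. each control."""
    g = []
    for i in range(len(controls)):
        up = list(controls)
        dn = list(controls)
        up[i] += eps
        dn[i] -= eps
        g.append((rollout_cost(x0, up) - rollout_cost(x0, dn)) / (2 * eps))
    return g


def plan(x0, horizon=10, iters=200, lr=0.05):
    """Projected gradient descent over the control sequence."""
    controls = [0.0] * horizon
    for _ in range(iters):
        g = grad(x0, controls)
        # Gradient step, then clip back onto the constraint set [0, 1].
        controls = [min(1.0, max(0.0, u - lr * gi)) for u, gi in zip(controls, g)]
    return controls


def mpc(x0, steps=20):
    """Receding horizon: re-plan at every step, apply only the first control."""
    x, traj, applied = x0, [x0], []
    for _ in range(steps):
        u = plan(x)[0]
        x = step(x, u)
        traj.append(x)
        applied.append(u)
    return traj, applied
```

Calling `mpc(1.0)` returns the controlled intensity trajectory and the controls actually applied. The projection step is where simple box constraints enter; other prior knowledge could, under the same assumptions, be encoded as extra cost terms or as a different projection.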