Workforce optimization is central to efficient organizational operations, where decisions may span several administrative levels and time scales. For instance, dispatching personnel to immediate service requests while also managing the acquisition of talent with varied expertise creates a highly dynamic optimization problem. Existing work focuses on specific sub-problems such as resource allocation and facility location, which are solved with heuristics such as local search and, more recently, with deep reinforcement learning. However, these formulations may not accurately represent real-world scenarios, in which such sub-problems are not fully independent. We aim to fill this gap with a simulator that models a unified workforce optimization problem. Specifically, we design a modular simulator to support the development of reinforcement learning (RL) methods for integrated workforce optimization problems. We focus on three interdependent aspects: personnel dispatch, workforce management, and personnel positioning. The simulator provides configurable parameterizations for exploring dynamic scenarios with varying levels of stochasticity and non-stationarity. To facilitate benchmarking and ablation studies, we also include heuristic and RL baselines for these three aspects.