This study explores the potential of reinforcement learning (RL) to improve career planning. Using data from Randstad The Netherlands, we simulate the Dutch job market and develop strategies that optimize employees' long-term income. We formulate career planning as a Markov Decision Process (MDP) and apply RL algorithms such as Sarsa, Q-Learning, and A2C to learn optimal policies that recommend career paths through high-income occupations and industries. The results show significant improvements in employees' income trajectories: the RL models, particularly Q-Learning and Sarsa, achieve an average income increase of 5% compared to observed career paths. The study has limitations, including narrow job filtering, simplifications in the environment formulation, and the assumptions of continuous employment and zero application costs. Future research can explore objectives beyond income optimization and address these limitations to further improve career planning.
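To make the MDP framing concrete, the following is a minimal sketch of tabular Q-Learning on a toy job market. It is not the study's environment: the three jobs, their salaries, the acceptance probabilities, and all hyperparameters are made-up assumptions for illustration. The state is the current job, the action is the job applied for, and the reward is the income of the job held in the next period. In line with the assumptions named above, employment is continuous and applying costs nothing.

```python
import random

# Hypothetical toy job market (not the study's data): three jobs with
# made-up salaries in arbitrary units and made-up acceptance probabilities.
SALARY = [1.0, 2.0, 4.0]
ACCEPT_PROB = [1.0, 0.8, 0.6]
N_JOBS = len(SALARY)


def step(job, target, rng):
    """Apply for `target`; move there if accepted, otherwise keep `job`."""
    next_job = target if rng.random() < ACCEPT_PROB[target] else job
    # Reward is the income of the job held in the next period.
    return next_job, SALARY[next_job]


def q_learning(episodes=400, horizon=50, alpha=0.1, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_JOBS for _ in range(N_JOBS)]  # q[state][action]
    for _ in range(episodes):
        job = rng.randrange(N_JOBS)  # each simulated career starts in a random job
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(N_JOBS)  # explore
            else:
                action = max(range(N_JOBS), key=lambda a: q[job][a])  # exploit
            next_job, reward = step(job, action, rng)
            # Off-policy target: bootstrap from the best action in the next state.
            td_target = reward + gamma * max(q[next_job])
            q[job][action] += alpha * (td_target - q[job][action])
            job = next_job
    return q


q_table = q_learning()
# Greedy policy: for each current job, the job to apply for next.
policy = [max(range(N_JOBS), key=lambda a: q_table[s][a]) for s in range(N_JOBS)]
print(policy)
```

Sarsa would differ only in the target: it bootstraps from the action actually chosen in the next state rather than from the maximum over actions.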