Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community. JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components in response to each individual's time-varying state. In this work, we explore the application of reinforcement learning methods to the problem of learning intervention option selection policies. We study the effect of context inference error and partial observability on the ability to learn effective policies. Our results show that the propagation of uncertainty from context inferences is critical to improving intervention efficacy as context uncertainty increases, while policy gradient algorithms can provide remarkable robustness to partially observed behavioral state information.