Artificial intelligence and digital health have the potential to transform global health. However, access to representative data for testing and validating algorithms in realistic production environments is essential. We introduce HealthSyn, an open-source synthetic data generator of user behavior for testing reinforcement learning (RL) algorithms in the context of mobile health interventions. The generator uses Markov processes to generate diverse user actions, with individual user behavioral patterns that can change in reaction to personalized interventions (i.e., reminders, recommendations, and incentives). These actions are translated into actual logs using an ML-purposed data schema specific to the mobile health application functionality included with HealthKit, an open-source SDK. The logs can be fed to pipelines to obtain user metrics. The generated data, which is based on real-world behaviors and simulation techniques, can be used to develop, test, and evaluate both ML algorithms in research and end-to-end operational RL-based intervention delivery frameworks.
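As an illustration of how a Markov process can generate user actions that respond to interventions, consider the minimal sketch below. It is not HealthSyn's actual implementation: the state names, the transition probabilities, the `boost` parameter, and the log record fields are all hypothetical and chosen only for illustration. A reminder shifts probability mass from staying idle toward opening the app, and each simulated step is emitted as a simple log record.

```python
import random

# Hypothetical action states; not taken from HealthSyn itself.
STATES = ["idle", "open_app", "log_activity"]

# Baseline transition probabilities (each row sums to 1). Illustrative values only.
BASE = {
    "idle": {"idle": 0.8, "open_app": 0.2, "log_activity": 0.0},
    "open_app": {"idle": 0.5, "open_app": 0.1, "log_activity": 0.4},
    "log_activity": {"idle": 0.7, "open_app": 0.3, "log_activity": 0.0},
}


def transition_row(state, reminded, boost=0.3):
    """Return the transition distribution out of `state`.

    Assumption: a reminder moves up to `boost` probability mass from
    "idle" -> "idle" to "idle" -> "open_app"; other rows are unchanged.
    """
    row = dict(BASE[state])
    if reminded and state == "idle":
        shift = min(boost, row["idle"])
        row["idle"] -= shift
        row["open_app"] += shift
    return row


def simulate_user(n_steps, remind_every=None, seed=0):
    """Simulate one user for `n_steps` and return a list of log records."""
    rng = random.Random(seed)
    state = "idle"
    logs = []
    for t in range(n_steps):
        # A reminder is sent at every `remind_every`-th step, if configured.
        reminded = remind_every is not None and t % remind_every == 0
        row = transition_row(state, reminded)
        state = rng.choices(STATES, weights=[row[s] for s in STATES])[0]
        logs.append({"step": t, "reminded": reminded, "action": state})
    return logs
```

Calling `simulate_user(100, remind_every=5)` and `simulate_user(100)` yields two logs that can be aggregated into user metrics (for example, the fraction of steps with `log_activity`) to compare behavior with and without the intervention.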