Estimating Treatment Effects Using Costly Simulation Samples from a Population-Scale Model of Opioid Use Disorder

From arXiv. To be presented at the IEEE International Conference on Biomedical and Health Informatics 2023. Repository link: https://github.com/abdulrahmanfci/intervention-estimation

Large-scale models require substantial computational resources to analyze and to study under different treatment conditions. In particular, estimating treatment effects by simulation can demand an infeasible amount of computation if many runs must be allocated to every treatment condition. It is therefore essential to develop efficient methods for allocating computational resources when estimating treatment effects. Agent-based simulation allows us to generate highly realistic simulation samples. FRED (A Framework for Reconstructing Epidemiological Dynamics) is an agent-based modeling system with a geospatial perspective that uses a synthetic population constructed from U.S. census data. Given its synthetic population, FRED simulations provide a baseline for comparing results across different treatment conditions. In this paper, we present three methods for estimating treatment effects. In the first method, we resort to brute-force allocation, where every treatment condition receives an equal and relatively large number of simulation runs. In the second method, we reduce the number of simulation runs by tailoring the number of samples for each treatment condition to the width of the confidence interval around its mean estimate. In the third method, we use a regression model, which allows us to learn across treatment conditions so that simulation samples allocated to one condition help estimate treatment effects in other conditions. We show that the regression-based method yields comparable estimates of treatment effects with fewer computational resources. The reduced variability and faster convergence of the model-based estimates come at the cost of increased bias, and the bias-variance trade-off can be controlled by adjusting the number of model parameters (e.g., by including higher-order interaction terms in the regression model).
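The second and third methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: `simulate` is a hypothetical toy stand-in for one costly run (it is not FRED), the two binary intervention flags, the effect sizes, the noise level, and the stopping threshold are invented, and the confidence interval uses a normal approximation.

```python
import random
import statistics

# Hypothetical stand-in for one costly simulation run (NOT FRED): returns a
# noisy outcome for a condition described by two binary intervention flags.
def simulate(a, b, rng):
    true_mean = 100.0 - 8.0 * a - 5.0 * b - 1.0 * a * b  # assumed ground truth
    return rng.gauss(true_mean, 10.0)

CONDITIONS = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (0, 0) is the baseline

def ci_half_width(samples, z=1.96):
    """Normal-approximation half-width of the 95% CI around the sample mean."""
    return z * statistics.stdev(samples) / len(samples) ** 0.5

def adaptive_allocation(rng, target=2.0, start=10, batch=10, cap=1000):
    """Second method (sketch): add runs to a condition until its CI is narrow."""
    samples = {}
    for cond in CONDITIONS:
        runs = [simulate(*cond, rng) for _ in range(start)]
        while ci_half_width(runs) > target and len(runs) < cap:
            runs.extend(simulate(*cond, rng) for _ in range(batch))
        samples[cond] = runs
    return samples

def fit_additive(samples):
    """Third method (sketch): least squares for y = b0 + b1*a + b2*b."""
    rows = [((1.0, a, b), y) for (a, b), ys in samples.items() for y in ys]
    k = 3
    # Augmented normal equations [X^T X | X^T y], solved by Gauss-Jordan.
    m = [[sum(x[i] * x[j] for x, _ in rows) for j in range(k)]
         + [sum(x[i] * y for x, y in rows)] for i in range(k)]
    for i in range(k):
        pivot = m[i][i]
        m[i] = [v / pivot for v in m[i]]
        for r in range(k):
            if r != i:
                factor = m[r][i]
                m[r] = [vr - factor * vi for vr, vi in zip(m[r], m[i])]
    return [m[i][k] for i in range(k)]

rng = random.Random(0)
samples = adaptive_allocation(rng)
b0, b1, b2 = fit_additive(samples)

baseline_mean = statistics.mean(samples[(0, 0)])
for a, b in CONDITIONS[1:]:
    direct = statistics.mean(samples[(a, b)]) - baseline_mean  # per-condition estimate
    pooled = b1 * a + b2 * b                                   # regression estimate
    print(f"condition ({a},{b}): runs={len(samples[(a, b)])} "
          f"direct={direct:.2f} regression={pooled:.2f}")
```

The additive model deliberately omits the `a * b` interaction that the toy ground truth contains, so its estimates pool information across conditions but are slightly biased. Adding the interaction term would make the model saturated for this 2x2 design, reproducing the per-condition means and trading that bias back for variance.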
