Modeling and predicting epidemic spread is critical for informing mitigation policy. We present a new method based on Gaussian Process Regression that models and predicts epidemics and quantifies prediction confidence through variance and high-probability error bounds. Gaussian Process Regression excels at working with small datasets and providing uncertainty bounds, and both properties are critical when modeling and predicting epidemic spreading processes from limited data. However, formal uncertainty bounds for Gaussian Process Regression have not yet been derived in the epidemic setting, which limits its usefulness in guiding mitigation efforts. In this work, we therefore develop a novel bound on the prediction variance that quantifies the impact of the epidemic data on our predictions. Further, we develop a high-probability error bound on the prediction and quantify how the epidemic spread, the infection data, and the length of the prediction horizon affect this bound. We also show that the error stays below a threshold determined by the length of the prediction horizon. To illustrate this framework, we use Gaussian Process Regression to model and predict COVID-19 using real-world infection data from the United Kingdom.
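For readers unfamiliar with how Gaussian Process Regression attaches a variance to each prediction, the following is a minimal sketch of the standard posterior mean and variance computation. It is not the bound developed in this work. The squared-exponential kernel, the hyperparameter values, the helper names `rbf_kernel` and `gp_predict`, and the synthetic training signal are all assumptions chosen for illustration; no real infection data is used.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=5.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D arrays of time points (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_predict(t_train, y_train, t_test, noise_var=1e-2,
               length_scale=5.0, signal_var=1.0):
    """Standard GP posterior mean and pointwise variance at t_test."""
    K = rbf_kernel(t_train, t_train, length_scale, signal_var)
    K += noise_var * np.eye(len(t_train))          # observation noise on the diagonal
    K_s = rbf_kernel(t_train, t_test, length_scale, signal_var)
    K_ss = rbf_kernel(t_test, t_test, length_scale, signal_var)

    L = np.linalg.cholesky(K)                      # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)   # prior variance minus explained part
    return mean, np.maximum(var, 0.0)              # clip tiny negative round-off


# Synthetic, standardized stand-in for an infection curve (illustrative only).
t_train = np.arange(0.0, 30.0)                     # days 0..29
y_train = np.sin(t_train / 6.0)

# Predict 1, 6, and 16 days past the last observation.
t_test = np.array([30.0, 35.0, 45.0])
mean, var = gp_predict(t_train, y_train, t_test)

for t, m, s2 in zip(t_test, mean, var):
    print(f"day {t:.0f}: mean={m:.3f}, variance={s2:.3f}")
```

In this sketch the posterior variance grows as the test point moves further beyond the last observation and never exceeds the prior variance `signal_var`. That is the qualitative dependence on the prediction horizon that the abstract refers to, though the formal bounds themselves are not reproduced here.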