The onset of the COVID-19 pandemic made clear the need for comprehensive data collection and for making those data available to scientists and experts for proper analysis. In Germany, the Robert Koch Institute (RKI) has tried to keep up with this demand for COVID-19 data, but relevant data needed to understand the whole picture of the pandemic were, and still are, missing. In this paper, we take a closer look at the severity of the course of COVID-19 in Germany, for which the ideal information would be the number of patients newly admitted to intensive care units (ICUs). This information was, and still is, not available. Instead, the current ICU occupancy at the district level was reported daily. We demonstrate how this information can be used to predict the numbers of both incoming and released COVID-19 patients using a stochastic version of the Expectation Maximisation algorithm (SEM). This, in turn, allows us to estimate the influence of district-specific and age-specific infection rates, as well as of further covariates including spatial effects, on the number of incoming patients. The paper demonstrates that even if relevant data are not recorded or officially provided, statistical modelling allows them to be reconstructed. This includes the quantification of uncertainty, which results naturally from the application of the SEM algorithm.
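To make the reconstruction idea concrete, the following is a minimal sketch of a stochastic EM loop on simulated data. It is not the model of the paper: it assumes a single series, constant rates, and independent Poisson counts for incoming and released patients, so that the daily change in occupancy is their difference. The function name `stochastic_em`, the truncation bound `r_max`, and all parameter values are hypothetical choices made for illustration only; the covariates and spatial effects mentioned above are left out entirely.

```python
import numpy as np

def stochastic_em(d, n_iter=200, burn_in=100, r_max=80, seed=0):
    """Toy stochastic EM for d_t = I_t - R_t.

    Assumption (not taken from the paper): I_t ~ Poisson(lam) and
    R_t ~ Poisson(mu), independent, with constant rates.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=int)
    # Lookup table of log(k!), large enough for every count used below.
    size = r_max + int(np.abs(d).max()) + 1
    log_fact = np.cumsum(np.log(np.maximum(np.arange(size), 1)))
    lam, mu = 1.0, 1.0                      # arbitrary starting values
    released = np.zeros_like(d)
    trace = []
    for _ in range(n_iter):
        # S-step: draw released counts from
        # P(R = r | I - R = v), proportional to lam^(r+v)/(r+v)! * mu^r/r!,
        # on a support truncated to r_max values.
        for v in np.unique(d):
            r = np.arange(max(0, -v), max(0, -v) + r_max)
            i = r + v                       # implied incoming counts, all >= 0
            log_w = i * np.log(lam) - log_fact[i] + r * np.log(mu) - log_fact[r]
            p = np.exp(log_w - log_w.max())
            p /= p.sum()
            mask = d == v
            released[mask] = rng.choice(r, size=mask.sum(), p=p)
        incoming = released + d
        # M-step: Poisson maximum likelihood on the completed data.
        lam = max(incoming.mean(), 1e-6)
        mu = max(released.mean(), 1e-6)
        trace.append((lam, mu))
    trace = np.array(trace)
    # Average the post-burn-in iterates; the trace itself shows their spread.
    return trace[burn_in:].mean(axis=0), trace, incoming, released

# Simulated example: only the daily change in occupancy is "observed".
rng = np.random.default_rng(1)
true_in = rng.poisson(6.0, size=300)
true_out = rng.poisson(5.0, size=300)
d = true_in - true_out

(lam_hat, mu_hat), trace, incoming, released = stochastic_em(d)
print("estimated incoming / released rates:", lam_hat, mu_hat)
print("spread of iterates after burn-in:", trace[100:].std(axis=0))
```

Each iteration produces a completed data set in which the imputed incoming and released counts reproduce the observed changes exactly. The variation of the iterates after burn-in gives a rough, informal view of how uncertainty arises from the repeated simulation step.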