Various processes, such as cell differentiation and disease spreading, can be modelled as quasi-reaction systems of stochastic differential equations. Since the underlying particle interactions, such as reactions between proteins or contacts between people, are typically unobserved, statistical inference of the parameters driving these systems is based on concentration data measuring each unit in the system over time. While observing the continuous-time process at as fine a time scale as possible should in theory help with parameter estimation, the existing Local Linear Approximation (LLA) methods fail in this case, owing to numerical instability caused by the small changes of the system between successive time points. On the other hand, one may be able to reconstruct the underlying unobserved interactions from the observed count data. Motivated by this, we first formalise the latent event history model underlying the observed count process. We then propose a computationally efficient Expectation-Maximisation algorithm for parameter estimation, with an extended Kalman filtering procedure for the prediction of the latent states. A simulation study evaluates the performance of the proposed method and highlights the settings where it is particularly advantageous compared to the existing LLA approaches. Finally, we illustrate the methodology on the spread of the COVID-19 pandemic in Italy.
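To make the idea of a latent event history concrete, the following is a minimal sketch and not the method proposed here. It assumes a hypothetical two-reaction SIR system (infection and recovery) with illustrative mass-action hazards and arbitrary rate values; the names `V`, `hazards` and `simulate_interval` are introduced only for this example. It simulates the unobserved reaction events over one observation interval and checks that the observed change in counts equals the latent event counts weighted by the net-change matrix.

```python
import random

# Net-change matrix V for an assumed SIR system with state (S, I, R):
# row 0 is infection S + I -> 2I, row 1 is recovery I -> R.
V = [(-1, 1, 0), (0, -1, 1)]


def hazards(state, theta):
    """Illustrative mass-action hazards for the two reactions."""
    s, i, _ = state
    n = sum(state)
    return [theta[0] * s * i / n, theta[1] * i]


def simulate_interval(state, theta, dt, rng):
    """Gillespie simulation over [0, dt].

    Returns the state at time dt and the latent number of times
    each reaction fired during the interval.
    """
    state = list(state)
    counts = [0, 0]
    t = 0.0
    while True:
        h = hazards(state, theta)
        total = sum(h)
        if total <= 0.0:
            break  # no reaction can fire any more
        t += rng.expovariate(total)  # waiting time to the next event
        if t > dt:
            break  # next event falls outside the interval
        # Choose which reaction fired, with probability proportional to its hazard.
        j = 0 if rng.random() * total < h[0] else 1
        counts[j] += 1
        state = [x + v for x, v in zip(state, V[j])]
    return state, counts


rng = random.Random(0)
y0 = [990, 10, 0]  # assumed initial counts (S, I, R)
y1, latent = simulate_interval(y0, theta=(0.5, 0.1), dt=1.0, rng=rng)

# Only y0 and y1 would be observed in practice; `latent` is hidden.
observed_change = [b - a for a, b in zip(y0, y1)]
implied_change = [sum(latent[j] * V[j][k] for j in range(2)) for k in range(3)]
print(observed_change, latent, implied_change)
```

In this sketch the observed increment is fully determined by the hidden event counts, which is the relationship that a reconstruction of the unobserved interactions from count data would rely on. When `dt` is small, few events fire and the increments are close to zero, the regime in which the abstract reports that LLA methods become numerically unstable.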