We introduce a continuous-time Markov chain framework for estimating population size from multi-list data. The framework allows directional interactions to be modelled and can accommodate absorbing lists, such as death records, as well as more general data collection processes. The standard model within this framework is equivalent to the log-linear model for multi-list data when lists are independent, and we show empirically that the two give similar results when lists are dependent. Through a simulation study, we highlight the need to account for an absorbing list, either by using the Markov model or by using the log-linear model with forced absorbing interactions; otherwise, the population size estimates are biased. We motivate our approach with an epidemiological dataset on individuals suffering a first-ever stroke in North-West England, in which one of the lists is a death record. We illustrate a further use of our approach by considering a case of ordered lists in drug use data from the City of London.