More than half of the world's population is exposed to the risk of mosquito-borne diseases, which cause millions of cases and hundreds of thousands of deaths every year. Analyzing data on these diseases is often complex and poses several interesting challenges, mainly due to the vast geographic area, the peculiar temporal behavior, and the potential correlation between infections. Our motivation stems from the analysis of tropical disease data, namely the number of cases of two arboviruses, dengue and chikungunya, transmitted by the same mosquito, in all 145 microregions of Southeast Brazil from 2018 to 2022. As a contribution to the literature on multivariate disease data, we develop a flexible Bayesian multivariate spatio-temporal model in which temporal dependence is defined for areal clusters. The model features a prior distribution for the random partition of areal data that incorporates neighboring information, thus encouraging maps with few contiguous clusters and discouraging clusters with disconnected areas. The model also incorporates an autoregressive structure and terms related to seasonal patterns into temporal components that are disease- and cluster-specific. In addition, it uses a multivariate directed acyclic graph autoregressive structure to accommodate spatial and inter-disease dependence, which facilitates the interpretation of spatial correlation. We explore properties of the model by way of simulation studies and show results indicating that our proposal compares well with competing alternatives. Finally, we apply the model to the motivating dataset with a twofold goal: clustering areas where the temporal trends of certain diseases are similar, and exploring the potential existence of temporal and/or spatial correlation between two diseases transmitted by the same mosquito.
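To make concrete what "clusters with disconnected areas" means for an areal partition, the following is a minimal sketch that flags clusters whose member areas do not form a single contiguous block on the adjacency graph. It is not the prior distribution proposed here and does not reproduce any part of the model; the function names, the five-area toy map, and the cluster labels are hypothetical and chosen only for illustration.

```python
from collections import deque


def count_components(cluster, neighbors):
    """Count connected pieces of `cluster` in the area adjacency graph."""
    members = set(cluster)
    seen = set()
    components = 0
    for start in members:
        if start in seen:
            continue
        # Each unseen area starts a new connected piece; explore it by BFS.
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            area = queue.popleft()
            for nb in neighbors.get(area, ()):
                if nb in members and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
    return components


def disconnected_clusters(labels, neighbors):
    """Return the labels of clusters made of more than one connected piece."""
    clusters = {}
    for area, label in labels.items():
        clusters.setdefault(label, []).append(area)
    return sorted(
        label
        for label, areas in clusters.items()
        if count_components(areas, neighbors) > 1
    )


# Hypothetical toy map: five areas on a line, 0 - 1 - 2 - 3 - 4.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

# Partition with two contiguous clusters.
contiguous = {0: "a", 1: "a", 2: "b", 3: "b", 4: "b"}

# Partition in which cluster "a" is split into two separate pieces.
broken = {0: "a", 1: "b", 2: "b", 3: "b", 4: "a"}

print(disconnected_clusters(contiguous, neighbors))
print(disconnected_clusters(broken, neighbors))
```

A partition prior that uses neighboring information would assign lower probability to maps like the second one; how that penalty is defined is specific to the model and is not shown in this sketch.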