Accurate estimation of cause-specific mortality fractions (CSMFs), the percentage of deaths attributable to each cause in a population, is essential for global health monitoring. A challenge arises because computer-coded verbal autopsy (CCVA) algorithms, commonly used to estimate CSMFs, frequently misclassify the cause of death (COD). The misclassification is further complicated by its structured patterns and substantial variation across countries. To address this, we introduce the R package 'vacalibration'. It implements a modular Bayesian framework that corrects for the misclassification, thereby yielding more accurate CSMF estimates from verbal autopsy (VA) questionnaire data. The package uses uncertainty-quantified CCVA misclassification matrix estimates derived from data collected in the CHAMPS project and available in the 'CCVA-Misclassification-Matrices' GitHub repository. Currently, these matrices cover three CCVA algorithms (EAVA, InSilicoVA, and InterVA) and two age groups (neonates aged 0-27 days, and children aged 1-59 months) across countries (specific estimates for Bangladesh, Ethiopia, Kenya, Mali, Mozambique, Sierra Leone, and South Africa, and a combined estimate for all other countries), enabling global calibration. The 'vacalibration' package also supports ensemble calibration when outputs from multiple algorithms are available. Implemented using 'RStan', the package offers rapid computation, uncertainty quantification, and seamless compatibility with openVA, a leading COD analysis software ecosystem. We demonstrate the package's flexibility with two real-world applications in COMSA-Mozambique and CA CODE. The package and its foundational methodology apply more broadly and can calibrate any discrete classifier or ensemble of classifiers.
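To make the idea of correcting for misclassification concrete, the following is a minimal sketch, not the package's method or API. It assumes a single known misclassification matrix `M` whose entry `M[i][j]` is the probability that the algorithm predicts cause `j` when the true cause is `i`, so the expected raw CSMF is `q_j = sum_i p_i * M[i][j]`. It then recovers the true CSMF `p` with a plain maximum-likelihood EM iteration. The function name `calibrate_csmf`, the toy matrix, and the counts are all hypothetical. The Bayesian framework described above additionally propagates uncertainty in the matrix and in the estimates, which this sketch ignores.

```python
def calibrate_csmf(counts, misclass, n_iter=1000):
    """Point estimate of true CSMFs from counts of algorithm-predicted causes.

    counts[j]      : number of deaths the algorithm assigned to cause j
    misclass[i][j] : assumed P(predicted cause j | true cause i); rows sum to 1
    Illustrative EM only; not the Bayesian model used by 'vacalibration'.
    """
    n_causes = len(counts)
    total = sum(counts)
    p = [1.0 / n_causes] * n_causes  # start from a uniform CSMF
    for _ in range(n_iter):
        # Predicted-cause distribution implied by the current p
        q = [sum(p[i] * misclass[i][j] for i in range(n_causes))
             for j in range(n_causes)]
        # Reassign each predicted count to true causes in proportion to
        # p[i] * misclass[i][j], then renormalise by the total count
        p = [
            sum(counts[j] * p[i] * misclass[i][j] / q[j]
                for j in range(n_causes) if q[j] > 0) / total
            for i in range(n_causes)
        ]
    return p


# Toy example: 3 causes, counts constructed from a true CSMF of (0.5, 0.3, 0.2)
M = [[0.8, 0.1, 0.1],
     [0.2, 0.7, 0.1],
     [0.1, 0.2, 0.7]]
raw_counts = [480, 300, 220]  # raw (uncalibrated) CSMF would be (0.48, 0.30, 0.22)
calibrated = calibrate_csmf(raw_counts, M)
print([round(x, 3) for x in calibrated])
```

In this toy case the raw fractions differ from the CSMF used to construct the counts, and the iteration moves the estimate back toward it. With real data the matrix is itself estimated with uncertainty, which is why a full Bayesian treatment is preferable to this point estimate.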