In biomedical and public health association studies, binary outcome variables may be subject to misclassification, which can substantially bias effect estimates. Attempts to address binary outcome misclassification in regression models are often hindered by model identifiability issues. In this paper, we characterize the identifiability problems in this class of models as a specific case of "label switching" and leverage a pattern in the resulting parameter estimates to resolve the permutation invariance of the complete data log-likelihood. Our proposed algorithm for binary outcome misclassification models does not require gold standard labels and relies only on the assumption that the sum of the sensitivity and specificity exceeds 1. A label switching correction is applied within estimation methods to recover unbiased effect estimates and to estimate misclassification rates. Open source software is provided to implement the proposed methods. We present a detailed simulation study of the proposed methodology and apply these methods to data from the 2020 Medical Expenditure Panel Survey (MEPS).
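To make the label switching idea concrete, the following is a minimal sketch and not the authors' released software. It assumes a simplified setting in which the true outcome follows a logistic regression with coefficients `beta` and the misclassification rates are constant (a single sensitivity and specificity); the function names are hypothetical. In that setting, relabeling the latent outcome maps (beta, sensitivity, specificity) to (-beta, 1 - specificity, 1 - sensitivity) without changing the distribution of the observed outcome, so the assumption that sensitivity plus specificity exceeds 1 selects one of the two equivalent solutions.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def observed_prob(beta, sens, spec, x):
    """P(observed outcome = 1 | x) when the true outcome is logistic in x.

    Assumes constant misclassification rates (an illustrative simplification).
    """
    p_true = expit(sum(b * xi for b, xi in zip(beta, x)))
    return sens * p_true + (1.0 - spec) * (1.0 - p_true)


def correct_label_switching(beta, sens, spec):
    """Return the estimates consistent with sens + spec > 1.

    If the fitted estimates already satisfy the constraint (or sit exactly
    on the boundary sens + spec == 1), they are returned unchanged.
    Otherwise the latent labels are swapped: coefficients flip sign and
    (sens, spec) becomes (1 - spec, 1 - sens).
    """
    if sens + spec >= 1.0:
        return list(beta), sens, spec
    return [-b for b in beta], 1.0 - spec, 1.0 - sens


# Hypothetical label-switched estimates from an unconstrained fit.
beta_hat, sens_hat, spec_hat = [0.5, -1.2], 0.15, 0.20
beta_fix, sens_fix, spec_fix = correct_label_switching(beta_hat, sens_hat, spec_hat)

# Both parameter sets imply the same observed-outcome probability.
x = [1.0, 0.3]
p_before = observed_prob(beta_hat, sens_hat, spec_hat, x)
p_after = observed_prob(beta_fix, sens_fix, spec_fix, x)
```

Because `p_before` and `p_after` agree for every `x`, the observed data alone cannot distinguish the two parameter sets; the constraint on sensitivity plus specificity is what breaks the tie.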