ssROC: Semi-Supervised ROC Analysis for Reliable and Streamlined Evaluation of Phenotyping Algorithms

Objective: High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA) estimation and evaluation. To address this challenge, numerous weakly-supervised learning methods have been proposed. However, there is a paucity of methods for reliably evaluating the predictive performance of PAs when only a very small proportion of the data is labeled. To fill this gap, we introduce a semi-supervised approach (ssROC) for estimating the receiver operating characteristic (ROC) parameters of PAs (e.g., sensitivity, specificity).

Materials and Methods: ssROC uses a small labeled dataset to nonparametrically impute the missing labels. The imputations are then used for ROC parameter estimation, yielding more precise estimates of PA performance than classical supervised ROC analysis (supROC), which uses only the labeled data. We evaluated ssROC with synthetic, semi-synthetic, and EHR data from Mass General Brigham (MGB).

Results: ssROC produced ROC parameter estimates with minimal bias and significantly lower variance than supROC in the synthetic and semi-synthetic data. For the five PAs from MGB, the estimates from ssROC are, on average, 30% to 60% less variable than those from supROC.

Discussion: ssROC enables precise evaluation of PA performance without demanding large volumes of labeled data. ssROC can also be implemented easily in open-source R software.

Conclusion: When used in conjunction with weakly-supervised PAs, ssROC facilitates the reliable and streamlined phenotyping necessary for EHR-based research.
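The two-step idea described under Materials and Methods (impute the missing labels nonparametrically from a small labeled set, then plug the imputations into the ROC estimators) can be illustrated with a minimal Python sketch. This is not the authors' R implementation: the Gaussian-kernel Nadaraya-Watson smoother, the rule-of-thumb bandwidth, the simulated scores, and every function name (`nw_impute`, `sup_roc`, `ss_roc`) are assumptions chosen for illustration only.

```python
import numpy as np

def nw_impute(s_lab, y_lab, s_new, bandwidth):
    """Nadaraya-Watson estimate of P(Y = 1 | S = s) at each point of s_new.

    Assumption: a Gaussian kernel stands in for the paper's nonparametric
    imputation step; the source does not specify the smoother.
    """
    diffs = (s_new[:, None] - s_lab[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    denom = np.clip(weights.sum(axis=1), 1e-12, None)  # guard against 0/0
    return (weights @ y_lab) / denom

def sup_roc(s_lab, y_lab, cutoff):
    """Supervised TPR and FPR at one cutoff, using labeled data only."""
    pred = s_lab >= cutoff
    return pred[y_lab == 1].mean(), pred[y_lab == 0].mean()

def ss_roc(s_lab, y_lab, s_unlab, cutoff, bandwidth):
    """Semi-supervised TPR and FPR: imputed labels replace the missing ones."""
    m = nw_impute(s_lab, y_lab, s_unlab, bandwidth)  # imputed P(Y = 1 | S)
    pred = s_unlab >= cutoff
    tpr = np.sum(m * pred) / np.sum(m)
    fpr = np.sum((1.0 - m) * pred) / np.sum(1.0 - m)
    return tpr, fpr

# Hypothetical simulated data: a PA score S that is higher for true cases.
rng = np.random.default_rng(0)
n_total, n_labeled = 5000, 400
y = rng.binomial(1, 0.3, size=n_total)
s = rng.normal(loc=1.5 * y, scale=1.0)

# Only the first n_labeled records are treated as chart-reviewed.
s_lab, y_lab = s[:n_labeled], y[:n_labeled]
s_unlab = s[n_labeled:]

# Assumption: an ad hoc rule-of-thumb bandwidth, not the paper's choice.
h = 1.06 * s_lab.std() * n_labeled ** (-1 / 5)

cutoff = 0.75
tpr_sup, fpr_sup = sup_roc(s_lab, y_lab, cutoff)
tpr_ss, fpr_ss = ss_roc(s_lab, y_lab, s_unlab, cutoff, h)
print(f"supROC: TPR={tpr_sup:.3f}, FPR={fpr_sup:.3f}")
print(f"ssROC : TPR={tpr_ss:.3f}, FPR={fpr_ss:.3f}")
```

In this sketch the supervised estimator averages over the labeled records only, while the semi-supervised estimator averages over the much larger unlabeled set, with each record weighted by its imputed probability of being a case. Repeating the simulation over many seeds and comparing the spread of the two estimators is one way to examine the kind of variance reduction the abstract reports.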
