Contrasting Global and Patient-Specific Regression Models via a Neural Network Representation

When developing clinical prediction models, it can be challenging to strike a balance between global models that are valid for all patients and personalized models tailored to individuals or potentially unknown subgroups. To aid such decisions, we propose a diagnostic tool for contrasting global regression models with patient-specific (local) regression models. The core utility of this tool is to identify where and for whom a global model may be inadequate. Specifically, we suggest a localized regression approach that identifies regions in the predictor space where patients are not well represented by the global model. Because localization becomes challenging with many predictors, we propose modeling in a dimension-reduced latent representation obtained from an autoencoder. Using such a neural network architecture for dimension reduction enables learning a latent representation that is simultaneously optimized for good data reconstruction and for revealing local outcome-related associations suitable for robust localized regression. We illustrate the proposed approach with a clinical study involving patients with chronic obstructive pulmonary disease. Our findings indicate that the global model is adequate for most patients, but that specific subgroups do benefit from personalized models. We also demonstrate how to map these subgroup models back to the original predictors, providing insight into why the global model falls short for these groups. Thus, the principal diagnostic yield of the tool is the identification and characterization of patients or subgroups whose outcome associations deviate from the global model.
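
The contrast between a global fit and patient-specific fits can be illustrated with a minimal sketch. It is not the implementation used in this work: the two-dimensional latent coordinates are simulated rather than produced by an autoencoder, and the Gaussian kernel, the bandwidth of 0.5, the simulated subgroup, and the helper names `fit_weighted_ols` and `local_fits` are all assumptions made for illustration.

```python
import numpy as np

def fit_weighted_ols(Z, y, w):
    """Weighted least squares with intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(Z)), Z])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary least squares
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def local_fits(Z, y, bandwidth):
    """One kernel-weighted linear fit per patient, centered at that patient."""
    betas = np.empty((len(Z), Z.shape[1] + 1))
    for i, z0 in enumerate(Z):
        d2 = np.sum((Z - z0) ** 2, axis=1)
        w = np.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel (assumed)
        betas[i] = fit_weighted_ols(Z, y, w)
    return betas

# Simulated latent coordinates (stand-in for an encoder output).
rng = np.random.default_rng(0)
n = 400
Z = rng.normal(size=(n, 2))

# Hypothetical subgroup in which the second latent dimension also matters.
subgroup = Z[:, 0] > 1.0
y = Z[:, 0] + 1.5 * subgroup * Z[:, 1] + rng.normal(scale=0.1, size=n)

# Global model: all patients weighted equally.
beta_global = fit_weighted_ols(Z, y, np.ones(n))

# Patient-specific models and their distance from the global coefficients.
beta_local = local_fits(Z, y, bandwidth=0.5)
deviation = np.linalg.norm(beta_local - beta_global, axis=1)

print("mean deviation, subgroup:", deviation[subgroup].mean())
print("mean deviation, others:  ", deviation[~subgroup].mean())
```

In this simulated setting, patients whose local coefficients lie far from the global ones are candidates for a subgroup that the global model represents poorly. The bandwidth controls how local each fit is and would need to be tuned in practice.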
