Simultaneous Modeling of Disease Screening and Severity Prediction: A Multi-task and Sparse Regularization Approach

Disease prediction is one of the central problems in biostatistical research. Some biomarkers are not only helpful in diagnosing and screening diseases but are also associated with disease severity. From perspectives such as treatment prioritization, it would be helpful to construct a prediction model that can estimate severity at the diagnosis or screening stage. We focus on solving the combined tasks of screening and severity prediction by considering a composite response variable such as \{healthy, mild, intermediate, severe\}. This type of response variable is ordinal, but since the two tasks do not necessarily share the same statistical structure, the conventional cumulative logit model (CLM) may not be suitable. To handle the composite ordinal response, we propose the Multi-task Cumulative Logit Model (MtCLM) with structural sparse regularization. This model is flexible enough to fit the different structures of the two tasks while capturing the structure they share. In addition, MtCLM is valid as a stochastic model over the entire predictor space, unlike another conventional and flexible model, the non-parallel cumulative logit model (NPCLM). We conduct simulation experiments and a real data analysis to illustrate the model's prediction performance and interpretability.
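The remark about validity over the entire predictor space can be illustrated with a toy calculation. The sketch below is not the MtCLM proposed here. It only shows, under assumed and purely hypothetical parameter values, how a non-parallel cumulative logit model with category-specific slopes can imply negative category probabilities for some predictor values. The sign convention P(Y <= k | x) = sigmoid(alpha_k + beta_k * x) and the helper name npclm_category_probs are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def npclm_category_probs(x, alphas, betas):
    """Category probabilities implied by a non-parallel cumulative logit model.

    Assumed convention (illustrative only):
        P(Y <= k | x) = sigmoid(alphas[k] + betas[k] * x),  k = 0, ..., K-2
    and P(Y <= K-1 | x) = 1. Category probabilities are successive differences
    of the cumulative probabilities.
    """
    cum = [sigmoid(a + b * x) for a, b in zip(alphas, betas)] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


# Hypothetical parameters for four ordered categories
# (healthy, mild, intermediate, severe) and a single predictor.
alphas = [-1.0, 0.0, 1.0]
betas = [1.0, 0.5, 0.2]  # slopes differ across thresholds (non-parallel)

probs_at_0 = npclm_category_probs(0.0, alphas, betas)
probs_at_5 = npclm_category_probs(5.0, alphas, betas)

print("x = 0:", [round(p, 3) for p in probs_at_0])
print("x = 5:", [round(p, 3) for p in probs_at_5])
```

With these values the cumulative curves are correctly ordered at x = 0 but cross before x = 5, so some differences become negative even though they still sum to one. A model with a single shared slope (the parallel CLM) cannot produce such crossings as long as the intercepts are ordered.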

