The discovery of biomarkers, which are clinically useful biomolecules, and the development of prediction models based on them are important problems in biomedical research. Biomarkers are widely used for disease screening, and some are related not only to the presence or absence of a disease but also to its severity. Such biomarkers can be useful for treatment prioritization and clinical decision-making. To obtain a model that serves both disease screening and severity prediction, this paper focuses on regression modeling for an ordinal response with a hierarchical structure. If the response variable combines the presence of disease with its severity, for example \{{\it healthy, mild, intermediate, severe}\}, the simplest approach would be to apply the conventional ordinal regression model. However, the conventional model lacks flexibility and may not suit the problems addressed in this paper, where the levels of the response variable might be heterogeneous. Therefore, this paper proposes a model that treats screening and severity prediction as different tasks, together with an estimation method based on structured sparse regularization that exploits structure shared between the tasks when such commonality exists. In numerical experiments, the proposed method showed stable performance across many scenarios compared with existing ordinal regression methods.