With the increasing focus on precision medicine in medical research, numerous studies in recent years have sought to clarify the relationship between treatment effects and patient characteristics. Treatment effects often differ across patients with different characteristics, and various machine learning methods for estimating heterogeneous treatment effects have been proposed owing to their flexibility and high prediction accuracy. However, most of these methods rely on black-box models, which prevents direct interpretation of the relationship between patient characteristics and treatment effects. Moreover, most of these studies have focused on continuous or binary outcomes, although survival outcomes are also important in medical research. To address these challenges, we propose a heterogeneous treatment effect estimation method for survival data based on RuleFit, an interpretable machine learning method. Numerical simulation results confirmed that the prediction performance of the proposed method was comparable to that of existing methods. We also applied the proposed method to a dataset from an HIV study, the AIDS Clinical Trials Group Protocol 175 dataset, to illustrate its interpretability on real data. Overall, the proposed method yielded an interpretable model with sufficient prediction accuracy.