The tree-structured varying coefficient model (TSVC) is a flexible regression approach that allows the effects of covariates to vary with the values of effect modifiers. Relevant effect modifiers are identified inherently through recursive partitioning. To quantify uncertainty in TSVC models, we propose a procedure for constructing confidence intervals for the partition-specific coefficients. This task constitutes a selective inference problem, because the coefficients of a TSVC model result from data-driven model building. To account for this issue, we introduce a parametric bootstrap approach that is tailored to the complex structure of TSVC. Finite-sample properties of the proposed confidence intervals, particularly their coverage proportions, are evaluated in a simulation study. For illustration, we consider applications to data from COVID-19 patients and from patients suffering from acute odontogenic infection. The proposed approach may also be adapted to construct confidence intervals for other tree-based methods.
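To make the selective inference issue concrete, the following minimal Python sketch shows one way a parametric bootstrap can repeat the data-driven split selection within every replicate, so that the resulting percentile intervals reflect the model-building step. It is not the TSVC procedure itself: the single-split model, the BIC-based split choice, the Gaussian error assumption, the simulated data, and the names `fit_stump_vc`, `slope_at`, and `bootstrap_ci` are all hypothetical simplifications introduced for illustration.

```python
import numpy as np


def fit_stump_vc(x, z, y):
    """Fit y = a + b(z) * x, where b is constant or splits once on z.

    Hypothetical stand-in for tree building: the split is chosen by BIC
    over a coarse grid of quantiles of z (an assumption of this sketch).
    """
    n = len(y)
    ones = np.ones(n)

    def ols_bic(X):
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        return coef, rss, n * np.log(rss / n) + X.shape[1] * np.log(n)

    # Baseline model without any split.
    coef, rss, bic = ols_bic(np.column_stack([ones, x]))
    best = {"split": None, "coef": coef, "rss": rss, "bic": bic}
    # Candidate models with one split of the slope on z.
    for c in np.quantile(z, np.linspace(0.1, 0.9, 17)):
        left = z <= c
        coef, rss, bic = ols_bic(np.column_stack([ones, x * left, x * ~left]))
        if bic < best["bic"]:
            best = {"split": float(c), "coef": coef, "rss": rss, "bic": bic}
    return best


def slope_at(fit, z):
    """Return the fitted coefficient of x at each value in z."""
    z = np.asarray(z, dtype=float)
    if fit["split"] is None:
        return np.full(z.shape, fit["coef"][1])
    return np.where(z <= fit["split"], fit["coef"][1], fit["coef"][2])


def bootstrap_ci(x, z, y, z_eval, B=200, level=0.95, seed=0):
    """Percentile intervals for the slope at z_eval via parametric bootstrap."""
    rng = np.random.default_rng(seed)
    fit = fit_stump_vc(x, z, y)
    mu = fit["coef"][0] + slope_at(fit, z) * x
    sigma = np.sqrt(fit["rss"] / (len(y) - len(fit["coef"])))
    draws = np.empty((B, len(z_eval)))
    for b in range(B):
        # Simulate a response from the fitted model (Gaussian errors assumed),
        # then redo the split selection and the estimation from scratch.
        y_star = mu + rng.normal(0.0, sigma, size=len(y))
        draws[b] = slope_at(fit_stump_vc(x, z, y_star), z_eval)
    alpha = 1.0 - level
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2], axis=0)
    return slope_at(fit, z_eval), lo, hi


# Simulated data: the effect of x is 0.5 for z <= 0.5 and 2.0 otherwise.
rng = np.random.default_rng(1)
n = 300
z = rng.uniform(size=n)
x = rng.normal(size=n)
y = 1.0 + np.where(z <= 0.5, 0.5, 2.0) * x + rng.normal(scale=0.5, size=n)

est, lo, hi = bootstrap_ci(x, z, y, z_eval=np.array([0.2, 0.8]))
print(est, lo, hi)
```

In this sketch the coefficient is evaluated at fixed values of the effect modifier rather than per partition, because the selected partition can differ between bootstrap replicates; how replicates with differing tree structures are handled is a design choice that a full procedure has to settle.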