The scalar-on-image regression model examines the association between a scalar response and a bivariate functional predictor (e.g., an image) by estimating a bivariate coefficient function. Existing approaches often impose smoothness constraints to control the bias-variance trade-off and thereby prevent overfitting. However, such assumptions can hinder interpretability, especially when only certain regions of an image influence the response. In such settings, imposing sparsity assumptions on the coefficient function yields a more interpretable model. To address this challenge, we propose the Generalized Dantzig Selector, a novel method that jointly enforces sparsity and smoothness on the coefficient function. The proposed approach enhances interpretability by accurately identifying regions that do not contribute to changes in the response, while preserving stability in estimation. Extensive simulation studies and real data applications demonstrate that the new method is highly interpretable and achieves notable improvements over existing approaches. Moreover, we establish non-asymptotic bounds for the estimation error, providing theoretical guarantees for the proposed framework.
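For orientation, the model described above can be written in its standard textbook form. The notation below is an assumption for illustration and is not taken from the paper: $y_i$ is the scalar response, $X_i$ the image observed on a domain $\mathcal{S} \times \mathcal{T}$, $\beta$ the bivariate coefficient function, $\alpha$ an intercept, and $\varepsilon_i$ a noise term. The second display is the classical Dantzig selector for a vector-valued coefficient (design matrix $\mathbf{X}$, tuning parameter $\lambda$), shown only as a reference point; the generalized version proposed here additionally enforces smoothness, and its exact formulation is not given in this abstract.

```latex
% Scalar-on-image regression (standard form; notation assumed for illustration)
\[
  y_i \;=\; \alpha \;+\; \int_{\mathcal{S}} \int_{\mathcal{T}} X_i(s,t)\,\beta(s,t)\,\mathrm{d}t\,\mathrm{d}s \;+\; \varepsilon_i,
  \qquad i = 1, \dots, n .
\]

% Classical Dantzig selector for a coefficient vector b (reference point only,
% not the generalized estimator proposed in the paper)
\[
  \hat{\mathbf{b}} \;=\; \arg\min_{\mathbf{b}} \;\|\mathbf{b}\|_{1}
  \quad \text{subject to} \quad
  \bigl\| \mathbf{X}^{\top} (\mathbf{y} - \mathbf{X}\mathbf{b}) \bigr\|_{\infty} \;\le\; \lambda .
\]
```

Under this notation, a region of the image has no contribution to the response exactly when $\beta(s,t) = 0$ throughout that region, which is the sense in which sparsity of the coefficient function supports interpretability.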