High-dimensional data and regression with left-censored responses have each been studied extensively, but handling both at once remains challenging, and only a few methods have been proposed. In this paper, we apply the Iterative Hard Thresholding (IHT) algorithm to the Tobit model in this setting. Our theoretical analysis shows that the resulting estimator attains a nearly minimax-optimal convergence rate. We also extend the method to a distributed setting, where it requires only a few rounds of communication while retaining the estimation rate of the centralized version. Simulation results show that the IHT algorithm for the Tobit model achieves superior accuracy in prediction and subset selection, and that the distributed estimator performs nearly as well as the centralized one. When applied to high-dimensional left-censored HIV viral load data, our method shows similar advantages.
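To make the idea of IHT on a Tobit likelihood concrete, the following is a minimal sketch and not the paper's implementation. It assumes responses left-censored at zero, a known noise scale `sigma` (the abstract does not say how the scale is handled), a fixed step size, and a known sparsity level `s`. All function names and the simulated data are hypothetical and chosen only for illustration. Each iteration takes a gradient step on the average Tobit negative log-likelihood and then keeps the `s` largest coefficients in absolute value.

```python
import math
import random


def inverse_mills(z):
    """Return phi(z) / Phi(-z), the score weight of a censored observation."""
    if z > 30.0:  # asymptotic form; avoids dividing by an underflowed tail
        return z + 1.0 / z
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # equals Phi(-z)
    return pdf / tail


def tobit_gradient(X, y, beta, sigma=1.0):
    """Gradient in beta of the average Tobit negative log-likelihood.

    Assumption: sigma is treated as known and held fixed.
    """
    n, p = len(X), len(beta)
    grad = [0.0] * p
    for xi, yi in zip(X, y):
        eta = sum(a * b for a, b in zip(xi, beta))
        if yi > 0.0:
            # Uncensored: derivative of the Gaussian squared-residual term.
            w = -(yi - eta) / sigma ** 2
        else:
            # Censored at zero: derivative of -log Phi(-eta / sigma).
            w = inverse_mills(eta / sigma) / sigma
        for j in range(p):
            grad[j] += w * xi[j] / n
    return grad


def hard_threshold(v, s):
    """Keep the s entries of v with largest absolute value; zero the rest."""
    order = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)
    keep = set(order[:s])
    return [v[j] if j in keep else 0.0 for j in range(len(v))]


def iht_tobit(X, y, s, step=0.5, n_iter=100, sigma=1.0):
    """Gradient step followed by hard thresholding, starting from zero."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = tobit_gradient(X, y, beta, sigma)
        beta = hard_threshold([b - step * g for b, g in zip(beta, grad)], s)
    return beta


def simulate(n=500, p=20, seed=0):
    """Hypothetical toy data: 3 active coefficients, unit-variance noise."""
    rng = random.Random(seed)
    beta_true = [2.0, -2.0, 2.0] + [0.0] * (p - 3)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [
        max(sum(a * b for a, b in zip(xi, beta_true)) + rng.gauss(0.0, 1.0), 0.0)
        for xi in X
    ]
    return X, y, beta_true


if __name__ == "__main__":
    X, y, beta_true = simulate()
    beta_hat = iht_tobit(X, y, s=3)
    print("selected indices:", [j for j, b in enumerate(beta_hat) if b != 0.0])
```

In this sketch the thresholding step guarantees that the iterate never has more than `s` nonzero coefficients. The step size of 0.5 is an assumption suited to standardized features; in practice it would need tuning or a line search, and the sparsity level would typically be chosen by a data-driven criterion.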