In many statistical modeling problems, such as classification and regression, the coefficients are often sparse and blocky. The sparse fused Lasso is designed to recover such sparse and blocky structure, especially in the ultrahigh-dimensional setting, where the number of features far exceeds the number of samples. Quantile loss is a well-known robust loss function that is widely used in statistical modeling. In this paper, we propose a new sparse fused Lasso classification model and develop a unified multi-block linearized alternating direction method of multipliers (ADMM) algorithm that effectively selects sparse and blocky features for both regression and classification. We prove that the algorithm converges and derive a linear convergence rate. Owing to its lower time complexity, the algorithm has a significant advantage over existing methods for solving ultrahigh-dimensional sparse fused Lasso regression and classification models. The algorithm can also be easily extended to solve various existing fused Lasso models. Finally, we present numerical results on several synthetic and real-world examples, which demonstrate the robustness, scalability, and accuracy of the proposed classification model and algorithm.
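To make the two ingredients named in the abstract concrete, the following is a minimal sketch of the standard quantile (pinball) loss and the standard sparse fused Lasso penalty. It is not the paper's model or its ADMM algorithm, whose exact objective the abstract does not state. The function names, the quantile level `tau`, and the weights `lam1` and `lam2` are assumptions introduced here for illustration.

```python
import numpy as np


def quantile_loss(residuals, tau=0.5):
    """Standard pinball loss: sum of r * (tau - 1[r < 0]) over residuals r."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r * (tau - (r < 0))))


def sparse_fused_lasso_penalty(beta, lam1, lam2):
    """Standard sparse fused Lasso penalty (hypothetical weights lam1, lam2).

    lam1 * sum |beta_j| encourages sparsity;
    lam2 * sum |beta_j - beta_{j-1}| encourages blocky (piecewise-constant)
    coefficients.
    """
    beta = np.asarray(beta, dtype=float)
    l1_term = np.sum(np.abs(beta))
    fusion_term = np.sum(np.abs(np.diff(beta)))
    return float(lam1 * l1_term + lam2 * fusion_term)


# Two coefficient vectors with the same L1 norm but different structure.
beta_blocky = np.array([0, 0, 2, 2, 2, 0, 0], dtype=float)
beta_scattered = np.array([0, 2, 0, 2, 0, 2, 0], dtype=float)

penalty_blocky = sparse_fused_lasso_penalty(beta_blocky, lam1=1.0, lam2=1.0)
penalty_scattered = sparse_fused_lasso_penalty(beta_scattered, lam1=1.0, lam2=1.0)
```

Both vectors have the same L1 norm, but the fusion term penalizes the scattered vector more heavily, which is why this penalty favors blocky solutions.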