This paper introduces robust twoblock (RTB) simultaneous dimension reduction, the first statistically robust method to perform simultaneous dimension reduction in two blocks of variables while allowing the model complexity to be tuned in each block individually. The paper proposes both a dense and a sparse version of the new method. Sparse RTB is the first robust estimator that allows both the model complexity and the degree of sparsity to be selected for each block individually. RTB thereby makes it possible to optimally extract and summarize the relevant portion of information in each block of data, even in the presence of outliers. As a corollary, the estimators can be recombined into a single estimate of regression coefficients for multivariate regression that remains operable when the number of variables exceeds the number of cases in each block. An extensive simulation study illustrates that the new methods are resistant to different types of outliers while maintaining estimation efficiency across a range of dimensionality settings. These findings hold for both the dense and the sparse method. The methods' performance is further illustrated on two example data sets, and a straightforward algorithm is presented and made accessible in an open-source repository.