We show, using three empirical applications, that linear regression estimates that rely on a sparsity assumption are fragile in two ways. First, we document that choices in constructing the regressor matrix that leave ordinary least squares (OLS) estimates unchanged, such as the choice of baseline category for categorical controls, can move sparsity-based estimates by two standard errors or more. Second, we develop two tests of the sparsity assumption based on comparing sparsity-based estimators with OLS. The tests tend to reject the sparsity assumption in all three applications. Unless the number of regressors is comparable to or exceeds the sample size, OLS yields more robust results at little efficiency cost.
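The baseline-category point can be illustrated with a small simulation. The following is a minimal sketch, not the estimators or data used in the paper: the data-generating process, the penalty level `lam`, the choice to penalize only the control dummies, and the helper names `design`, `ols_coef_on_d`, and `lasso_coef_on_d` are all assumptions made for illustration. It fits a plain lasso by coordinate descent and compares the coefficient on a regressor of interest `d` across the four possible baseline categories.

```python
import numpy as np

# Simulated data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n, n_cat = 200, 4
g = rng.integers(0, n_cat, size=n)                    # categorical control
d = (rng.normal(size=n) + 0.5 * g > 1).astype(float)  # regressor of interest, correlated with g
effects = np.array([0.0, 2.0, -1.0, 3.0])             # category effects on the outcome
y = 1.0 * d + effects[g] + rng.normal(size=n)


def design(baseline):
    """Columns: intercept, d, and one dummy per category except the baseline."""
    dummies = [(g == k).astype(float) for k in range(n_cat) if k != baseline]
    return np.column_stack([np.ones(n), d] + dummies)


def ols_coef_on_d(X):
    """OLS coefficient on d (column 1)."""
    return np.linalg.lstsq(X, y, rcond=None)[0][1]


def lasso_coef_on_d(X, lam=0.1, sweeps=2000):
    """Lasso coefficient on d via coordinate descent.

    Minimizes (1/2n)||y - X b||^2 + lam * sum_j |b_j| over the dummy
    columns only; the intercept and d are left unpenalized (an assumption).
    """
    beta = np.zeros(X.shape[1])
    penalized = np.arange(X.shape[1]) >= 2
    for _ in range(sweeps):
        for j in range(X.shape[1]):
            # Residual with column j's current contribution added back.
            partial_resid = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_resid / n
            if penalized[j]:
                rho = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft-threshold
            beta[j] = rho / (X[:, j] @ X[:, j] / n)
    return beta[1]


ols = {b: ols_coef_on_d(design(b)) for b in range(n_cat)}
lasso = {b: lasso_coef_on_d(design(b)) for b in range(n_cat)}

for b in range(n_cat):
    print(f"baseline={b}  OLS={ols[b]:.4f}  lasso={lasso[b]:.4f}")
```

Every choice of baseline spans the same column space, so the OLS coefficient on `d` is identical across rows up to floating-point error. The lasso shrinks the dummy coefficients toward whichever category is omitted, so its coefficient on `d` will generally change with the baseline.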