We show, using three empirical applications, that linear regression estimates predicated on the assumption of sparsity are fragile in two ways. First, we document that choices in constructing the regressor matrix that leave ordinary least squares (OLS) estimates unchanged, such as the choice of baseline category for categorical controls, can move sparsity-based estimates by two standard errors or more. Second, we develop two tests of the sparsity assumption by comparing sparsity-based estimators with OLS. The tests tend to reject the sparsity assumption in all three applications. Unless the number of regressors is comparable to or exceeds the sample size, OLS yields more robust inference at little efficiency cost.
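The first fragility can be illustrated with a small simulation. The sketch below is not taken from the three applications: the data are simulated, and the sparsity-based estimator is assumed to be a lasso with an unpenalized intercept and treatment coefficient, fitted by a hand-written coordinate descent routine. The names `design`, `ols`, and `lasso_cd`, the penalty level `lam`, and the data-generating process are all illustrative choices. Changing the omitted baseline category leaves the column space of the regressor matrix unchanged, so the OLS coefficient on the regressor of interest is identical, whereas the lasso penalizes a different set of contrasts under each baseline and its coefficient shifts.

```python
import numpy as np

def design(d, cat, baseline, levels=4):
    """Regressor matrix: intercept, regressor of interest d, category dummies (baseline omitted)."""
    dummies = [(cat == k).astype(float) for k in range(levels) if k != baseline]
    return np.column_stack([np.ones_like(d), d] + dummies)

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def lasso_cd(X, y, lam, n_unpenalized=2, n_iter=2000):
    """Lasso by cyclic coordinate descent (illustrative, not optimized).

    Minimizes (1/(2n)) * ||y - X b||^2 + lam * sum_j |b_j|, where the sum
    runs only over columns with index >= n_unpenalized.
    """
    n, p = X.shape
    b = np.zeros(p)
    z = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * b[j]          # partial residual without column j
            rho = X[:, j] @ resid / n
            if j < n_unpenalized:
                b[j] = rho / z[j]            # plain least-squares update
            else:
                b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z[j]  # soft threshold
            resid -= X[:, j] * b[j]
    return b

# Simulated data (assumption): d is correlated with the categorical control.
rng = np.random.default_rng(0)
n = 200
cat = rng.integers(0, 4, size=n)
d = 0.5 * cat + rng.normal(size=n)
effects = np.array([0.0, 2.0, 2.0, 2.0])     # category effects look sparse only relative to some baselines
y = 1.0 * d + effects[cat] + rng.normal(size=n)

X_a = design(d, cat, baseline=0)
X_b = design(d, cat, baseline=3)

ols_a, ols_b = ols(X_a, y)[1], ols(X_b, y)[1]
lasso_a = lasso_cd(X_a, y, lam=0.1)[1]
lasso_b = lasso_cd(X_b, y, lam=0.1)[1]

print("OLS coefficient on d:  ", ols_a, ols_b)
print("Lasso coefficient on d:", lasso_a, lasso_b)
```

The two OLS coefficients agree up to floating-point error, while the two lasso coefficients differ. How large that difference is relative to a standard error depends on the penalty level and the design, so this sketch shows only the mechanism and not the magnitudes reported for the applications.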