The effect of collinearity and sample size on linear regression results: a simulation study

Background: Multicollinearity inflates the variance of OLS coefficients, widening confidence intervals and reducing inferential reliability. Yet fixed variance inflation factor (VIF) cut-offs are often applied uniformly across studies with very different sample sizes, even though collinearity is a finite-sample problem. We quantify how collinearity and sample size jointly affect linear regression performance and provide practical guidance for interpreting VIFs.

Methods: We simulated data across sample sizes from N=100 to N=100,000 and collinearity levels from VIF=1 to VIF=50. For each scenario we generated 1,000 datasets, fitted OLS models, and assessed coverage, mean absolute error (MAE), bias, traditional power (the probability that the 95% CI excludes 0), and precision assurance (the probability that the 95% CI lies within a prespecified margin around the true effect). We also evaluated a misspecified setting, in which a relevant predictor is omitted, to study bias amplification.

Results: Under correct specification, collinearity did not materially affect coverage, which stayed close to its nominal level, and did not introduce systematic bias, but it reduced precision in small samples: at N=100, even mild collinearity (VIF<2) inflated MAE and markedly reduced both traditional power and precision assurance, whereas at N>=50,000 estimates were robust even at VIF=50. Under misspecification, collinearity strongly amplified bias, increasing errors, reducing coverage, and sharply degrading both precision assurance and traditional power even at low VIF.

Conclusion: VIF thresholds should not be applied mechanically. Collinearity must be interpreted in relation to sample size and potential sources of bias; removing predictors solely to reduce VIF can worsen inference via omitted-variable bias. The accompanying heatmaps provide a practical reference across study sizes and modelling assumptions.
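To make the simulation design concrete, the following is a minimal sketch of one scenario, not the authors' code. It assumes a two-predictor model, for which VIF = 1/(1 - rho^2), so a target VIF fixes the correlation rho between the predictors. The function name `simulate`, the true effects (0.3, 0.3), unit error variance, the margin of 0.1, and the normal critical value 1.96 (rather than the t quantile) are all illustrative assumptions. The misspecified case here simply drops the second, correlated predictor, which may differ from the design used in the study.

```python
import numpy as np


def simulate(n, vif, beta=(0.3, 0.3), n_sims=1000, margin=0.1,
             omit_x2=False, seed=0):
    """Monte Carlo summary for the OLS estimate of beta[0] (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    # Two-predictor case: VIF = 1 / (1 - rho^2), so rho follows from the target VIF.
    rho = np.sqrt(1.0 - 1.0 / vif)
    b1, b2 = beta
    est = np.empty(n_sims)
    lo = np.empty(n_sims)
    hi = np.empty(n_sims)
    for s in range(n_sims):
        z1 = rng.standard_normal(n)
        z2 = rng.standard_normal(n)
        x1 = z1
        x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2  # corr(x1, x2) = rho
        y = b1 * x1 + b2 * x2 + rng.standard_normal(n)  # unit error variance (assumed)
        # Misspecified fit drops x2; correct fit keeps both predictors.
        cols = [np.ones(n), x1] if omit_x2 else [np.ones(n), x1, x2]
        D = np.column_stack(cols)
        coef = np.linalg.lstsq(D, y, rcond=None)[0]
        resid = y - D @ coef
        sigma2 = resid @ resid / (n - D.shape[1])
        se = np.sqrt(sigma2 * np.linalg.inv(D.T @ D)[1, 1])
        est[s] = coef[1]
        # Normal-approximation 95% CI (assumption; a t quantile is more exact at small n).
        lo[s] = coef[1] - 1.96 * se
        hi[s] = coef[1] + 1.96 * se
    return {
        "bias": float(np.mean(est - b1)),
        "mae": float(np.mean(np.abs(est - b1))),
        "coverage": float(np.mean((lo <= b1) & (b1 <= hi))),
        # Traditional power: the CI excludes 0.
        "power": float(np.mean((lo > 0) | (hi < 0))),
        # Precision assurance: the whole CI lies within +/- margin of the true effect.
        "assurance": float(np.mean((lo >= b1 - margin) & (hi <= b1 + margin))),
    }
```

Calling `simulate(100, 10.0)` and `simulate(100, 10.0, omit_x2=True)` gives a correctly specified and a misspecified scenario at the same sample size and collinearity level; looping over a grid of `n` and `vif` values would produce the kind of table that a heatmap summarises.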
