Irregular errors such as heteroscedasticity and nonnormality remain major challenges in linear modeling. These issues often lead to biased inference and unreliable measures of uncertainty. Traditional remedies, such as log transformations, robust standard errors, or weighted least squares, only partially address the problem and may fail when heteroscedasticity interacts with skewness or nonlinear mean patterns. To address these limitations, we propose a two-stage cumulative distribution function (CDF)-based beta regression framework. The response is first transformed using an empirical distribution function and modeled with a flexible beta distribution, then mapped back to the original scale via the empirical quantile function. Because the beta distribution links variance directly to its mean and precision, heteroscedasticity and nonnormality are handled naturally, without requiring ad hoc variance assumptions or weighting schemes. A comprehensive Monte Carlo simulation study evaluates the proposed method against competing methods, including weighted least squares. The results show that the CDF-based beta method outperforms traditional competitors. By directly modeling the full conditional distribution, it offers reliable inference, calibrated predictions even under extreme assumption violations, and meaningful interpretation of effects through percentile shifts.
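The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rescaled-rank transform `rank / (n + 1)`, the logit mean link, the constant precision parameter, and the function names `ecdf_transform`, `fit_beta_regression`, and `predict_original_scale` are all assumptions made for illustration, and the abstract's "flexible" beta model may well allow the precision to vary with covariates.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln


def ecdf_transform(y):
    """Map responses into (0, 1) with rescaled ranks, rank / (n + 1).

    Assumption: dividing by n + 1 keeps values strictly inside (0, 1),
    which the beta likelihood requires. Ties are broken by sort order.
    """
    y = np.asarray(y, dtype=float)
    ranks = np.argsort(np.argsort(y)) + 1  # ranks 1..n
    return ranks / (len(y) + 1.0)


def fit_beta_regression(X, u):
    """Maximum likelihood beta regression on the transformed response.

    Assumptions: logit link for the mean, one constant precision phi.
    Returns the coefficient vector and the estimated precision.
    """
    X = np.asarray(X, dtype=float)
    u = np.asarray(u, dtype=float)

    def neg_loglik(params):
        beta, log_phi = params[:-1], params[-1]
        mu = np.clip(expit(X @ beta), 1e-8, 1.0 - 1e-8)
        phi = np.exp(log_phi)
        a, b = mu * phi, (1.0 - mu) * phi  # beta shape parameters
        loglik = (gammaln(phi) - gammaln(a) - gammaln(b)
                  + (a - 1.0) * np.log(u) + (b - 1.0) * np.log1p(-u))
        return -np.sum(loglik)

    n_coef = X.shape[1]
    start = np.zeros(n_coef + 1)
    # Bound log(phi) so exp() cannot overflow during the line search.
    bounds = [(None, None)] * n_coef + [(-5.0, 10.0)]
    result = minimize(neg_loglik, start, method="L-BFGS-B", bounds=bounds)
    return result.x[:-1], float(np.exp(result.x[-1]))


def predict_original_scale(X_new, beta, y_train):
    """Map fitted mean percentiles back through the empirical quantile function."""
    mu = expit(np.asarray(X_new, dtype=float) @ beta)
    return np.quantile(np.asarray(y_train, dtype=float), mu)


if __name__ == "__main__":
    # Synthetic skewed, heteroscedastic data (illustrative only).
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=500)
    y = np.exp(1.0 + 2.0 * x + (0.2 + x) * rng.normal(size=500))

    X = np.column_stack([np.ones_like(x), x])
    beta_hat, phi_hat = fit_beta_regression(X, ecdf_transform(y))
    X_new = np.array([[1.0, 0.25], [1.0, 0.75]])
    print("coefficients:", beta_hat, "precision:", phi_hat)
    print("predictions:", predict_original_scale(X_new, beta_hat, y))
```

Under these assumptions, the slope coefficient describes how the expected percentile of the response shifts with the covariate on the logit scale, and the back-transformed predictions always lie within the range of the observed responses because they are empirical quantiles.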