This work demonstrates that applying a fixed-effect multiple linear regression (MLR) model to an overparameterized dataset is mathematically equivalent to fitting a hyper-curve parameterized by a single scalar. This reformulation shifts the focus from global coefficients to individual predictors, allowing each to be modeled as a function of a common parameter. We prove that this overparameterized linear framework can yield exact predictions even when the underlying data contains nonlinear dependencies that violate classical linear assumptions. By employing parameterization in terms of the dependent variable and a monomial basis, we validate this approach on both synthetic and experimental datasets. Our results show that the hyper-curve perspective provides a robust framework for regularizing problems with noisy predictors and offers a systematic method for identifying and removing 'improper' predictors that degrade model generalizability.
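The core idea above can be illustrated with a minimal sketch. Under illustrative assumptions (all variable names and the choice of polynomial degree are hypothetical, not taken from the paper), each predictor is fit separately as a monomial-basis function of the dependent variable, which plays the role of the hyper-curve's single scalar parameter:

```python
# Hyper-curve sketch: model each predictor x_j as a polynomial (monomial
# basis) in the dependent variable y, the common scalar parameter.
# Illustrative only; names and degree are assumptions, not the paper's setup.
import numpy as np

y = np.linspace(-1.0, 1.0, 50)          # dependent variable = curve parameter

# Predictors with nonlinear dependence on y (violating classical linearity)
X = np.column_stack([y**2, np.sin(np.pi * y), np.exp(y)])

degree = 6
V = np.vander(y, degree + 1)            # monomial (Vandermonde) basis in y

# Least-squares fit of every predictor as a function of the common parameter
coeffs, *_ = np.linalg.lstsq(V, X, rcond=None)
X_hat = V @ coeffs                      # predictors reconstructed from y alone

max_err = np.abs(X - X_hat).max()
print(f"max reconstruction error: {max_err:.2e}")
```

If the reconstruction error is small, the data points indeed lie (approximately) on a one-parameter curve in predictor space, which is the geometric picture the abstract describes.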