The Kling-Gupta efficiency ($\mathrm{KGE}$) is a model performance metric widely used in hydrology, but its properties as a statistical estimation criterion have remained unexplored. We formalize the Kling-Gupta loss $L_\mathrm{KG} = (1 - \mathrm{KGE})^2$ in an extremum estimation framework (minimizing $L_\mathrm{KG}$ is equivalent to maximizing $\mathrm{KGE}$) for multiple linear regression. We give explicit formulas showing that Kling-Gupta regression scales the ordinary least squares (OLS) coefficient vector by a variance-inflation factor that depends on sample variances and covariances. Its predictions reproduce the response variance of the training set, whereas OLS predictions have reduced variance; both estimators preserve the response mean and achieve the same sample correlation. We prove that no estimator simultaneously maximizes the Nash-Sutcliffe efficiency ($\mathrm{NSE}$) and $\mathrm{KGE}$: OLS maximizes $\mathrm{NSE}$ but not $\mathrm{KGE}$, whereas Kling-Gupta regression maximizes $\mathrm{KGE}$ at the expense of $\mathrm{NSE}$. We establish almost-sure convergence of the Kling-Gupta estimator to well-defined population limits. For each estimator, the training-set and test-set performance metrics converge asymptotically to the same limit, although these limits differ between OLS and Kling-Gupta regression. In a single-predictor model with a fixed intercept, we identify conditions under which a global minimum of $L_\mathrm{KG}$ does not exist because of a discontinuity at zero slope. This work establishes a mathematical foundation for $\mathrm{KGE}$-based estimation and clarifies its effects on predictive performance in hydrologic modeling.
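The relationship between the two estimators can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it assumes the commonly used definition $\mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (\beta-1)^2}$, with $r$ the sample correlation, $\alpha$ the ratio of standard deviations, and $\beta$ the ratio of means. Under that assumption, dividing the OLS slopes by the sample correlation $r$ between the response and the OLS fit, and re-centering the intercept, yields predictions with the response mean and variance. The function names (`fit_ols`, `fit_kg`, `kge`, `nse`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def kge(obs, sim):
    # Assumed standard KGE definition (correlation, variability ratio, bias ratio).
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = np.std(sim) / np.std(obs)
    beta = np.mean(sim) / np.mean(obs)
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def nse(obs, sim):
    # Nash-Sutcliffe efficiency: 1 minus the ratio of error to total sum of squares.
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

def fit_ols(X, y):
    # Ordinary least squares with an intercept column.
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]

def fit_kg(X, y):
    # Sketch: inflate the OLS slopes by 1/r, where r is the sample correlation
    # between y and the OLS fit, then choose the intercept that keeps the mean.
    b0, b = fit_ols(X, y)
    r = np.corrcoef(y, b0 + X @ b)[0, 1]
    b_kg = b / r
    b0_kg = np.mean(y) - np.mean(X, axis=0) @ b_kg
    return b0_kg, b_kg

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = 10.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=2.0, size=500)

b0, b = fit_ols(X, y)
b0_kg, b_kg = fit_kg(X, y)
pred_ols = b0 + X @ b
pred_kg = b0_kg + X @ b_kg

for name, pred in [("OLS", pred_ols), ("KG", pred_kg)]:
    print(name, "var ratio:", np.var(pred) / np.var(y),
          "KGE:", kge(y, pred), "NSE:", nse(y, pred))
```

In this sketch the rescaled fit has a variance ratio of one and a higher KGE than OLS, while OLS retains the higher NSE, which is consistent with the trade-off described above.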