This paper revisits the $\alpha$-regression framework for compositional data. The model uses a flexible power transformation, parameterized by $\alpha$, that interpolates between raw data analysis and log-ratio methods. It handles zeros naturally, without imputation, and allows the transformation to be selected in a data-driven way. We formulate $\alpha$-regression as a non-linear least squares problem, study its asymptotic properties, estimate it efficiently via the Levenberg-Marquardt algorithm, derive marginal effects for interpretation, and offer a visual inspection of the effect of each predictor. We further discuss robustified versions, the inclusion of natural splines, and the incorporation of compositional predictors, which in turn facilitates the formulation of a simple time series model. The framework is extended to spatial settings through four models: a) the $\alpha$-spatially-lagged X regression model, which incorporates spatial spillover effects via spatially-lagged covariates and decomposes them into direct and indirect effects; b) the $\alpha$-spatial autoregressive model, which allows for spatial autocorrelation; c) the geographically-weighted $\alpha$-regression, which allows coefficients to vary over space in order to capture local relationships; and d) the $\alpha$-eigenvector spatial filtering, which is computationally efficient and captures spatial dependence via the eigenvectors of the kernelized distance matrix. Applications to four real datasets illustrate that the models perform on par with or outperform existing models in the literature. The examples show that the spatial extensions capture the spatial dependence and improve predictive performance. Overall, the examples provide evidence that the log-ratio methodology does not lead to optimal results.
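The interpolation property mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the transformation itself, so the form used here, $u_\alpha(x) = \frac{1}{\alpha}\left(D\,\frac{x_i^\alpha}{\sum_j x_j^\alpha} - 1\right)$ for a $D$-part composition, is an assumption; the paper's exact definition may differ (for example, by an additional orthonormal projection). The function name `alpha_transform` is hypothetical.

```python
import numpy as np


def alpha_transform(x, alpha):
    """Power transformation of a composition (assumed form, see lead-in).

    x     : array whose last axis holds the D parts of a composition
    alpha : power parameter; alpha == 0 is treated as the log-ratio limit
    """
    x = np.asarray(x, dtype=float)
    D = x.shape[-1]
    if alpha == 0:
        # Limiting case: centred log-ratio (requires strictly positive parts).
        logx = np.log(x)
        return logx - logx.mean(axis=-1, keepdims=True)
    # Power-transform the parts and re-close them so they sum to 1.
    powered = x ** alpha
    closed = powered / powered.sum(axis=-1, keepdims=True)
    # Centre and scale so that alpha -> 0 recovers the log-ratio case.
    return (D * closed - 1.0) / alpha


x = np.array([0.2, 0.3, 0.5])
raw_like = alpha_transform(x, 1.0)       # affine in the raw composition
near_log = alpha_transform(x, 1e-6)      # close to the log-ratio limit
with_zero = alpha_transform(np.array([0.0, 0.4, 0.6]), 0.5)  # zero part, alpha > 0
```

With $\alpha = 1$ the result is an affine function of the raw composition, while $\alpha \to 0$ approaches the centred log-ratio. For $\alpha > 0$ a zero part stays finite, which is consistent with the abstract's statement that zeros need no imputation.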