This paper revisits the $α$--regression framework for compositional data. The model uses a flexible power transformation, parameterized by $α$, that interpolates between raw data analysis and log--ratio methods; it handles zeros naturally, without imputation, and allows the transformation to be selected in a data--driven way. We formulate $α$--regression as a non--linear least squares problem, study its asymptotic properties, estimate it efficiently via the Levenberg--Marquardt algorithm, derive marginal effects for interpretation, and provide a visual means of inspecting the effect of each predictor. We further discuss robustified versions, the inclusion of natural splines, and the incorporation of compositional predictors, which further facilitates the formulation of a simple time series model. The framework is extended to spatial settings through four models: (a) the $α$--spatially--lagged X regression model, which incorporates spatial spillover effects via spatially--lagged covariates, with a decomposition into direct and indirect effects; (b) the $α$--spatial autoregressive model, which allows for spatial autocorrelation; (c) the geographically--weighted $α$--regression, which allows coefficients to vary over space in order to capture local relationships; and (d) the $α$--eigenvector spatial filtering, which is computationally efficient and captures spatial dependence via the eigenvectors of the kernelized distance matrix. Applications to four real datasets illustrate that the models perform on par with or outperform existing models in the literature. The examples show that the spatial extensions capture the spatial dependence and improve predictive performance. Overall, the examples provide evidence that the log--ratio methodology does not lead to optimal results.
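To make the interpolation claim concrete, the following is a minimal sketch of a power transformation of this kind. The abstract does not state the formula, so the form used here is an assumption: the $α$--transformation as it is commonly defined in the compositional data literature, $z_α(x) = H\,(D\,u_α(x) - 1_D)/α$ with $u_α(x)_i = x_i^α / \sum_j x_j^α$, where $H$ is the Helmert sub-matrix. The function names `helmert_sub` and `alpha_transform` are hypothetical and chosen for illustration only.

```python
import numpy as np


def helmert_sub(D):
    """(D-1) x D Helmert sub-matrix: orthonormal rows, each orthogonal to the ones vector."""
    H = np.zeros((D - 1, D))
    for i in range(1, D):
        scale = np.sqrt(i * (i + 1))
        H[i - 1, :i] = 1.0 / scale   # first i entries are equal
        H[i - 1, i] = -i / scale     # next entry makes the row sum to zero
    return H


def alpha_transform(x, alpha):
    """Assumed alpha-transformation of a composition x (last axis sums to 1).

    alpha = 0 is treated as the limiting case, the isometric log-ratio
    transformation built from the same Helmert sub-matrix (requires x > 0).
    """
    x = np.asarray(x, dtype=float)
    D = x.shape[-1]
    H = helmert_sub(D)
    if alpha == 0:
        logx = np.log(x)
        clr = logx - logx.mean(axis=-1, keepdims=True)  # centred log-ratio
        return clr @ H.T
    p = x ** alpha                            # zeros stay zero when alpha > 0
    u = p / p.sum(axis=-1, keepdims=True)     # power-transformed composition
    return ((D * u - 1.0) / alpha) @ H.T


x = np.array([0.2, 0.3, 0.5])
print(alpha_transform(x, 1.0))    # linear in the raw composition
print(alpha_transform(x, 1e-6))   # numerically close to the log-ratio case
print(alpha_transform(x, 0.0))    # log-ratio limit
print(alpha_transform(np.array([0.0, 0.4, 0.6]), 0.5))  # a zero part is handled for alpha > 0
```

Under this assumed definition, $α = 1$ gives a linear map of the raw composition, $α \to 0$ recovers a log--ratio transformation, and any $α > 0$ remains well defined when a component is exactly zero, which is the behaviour the abstract describes.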