Growing data availability in urban and regional studies has given rise to a variety of spatial panel models for spatial panel data, in which observations exhibit spatial patterns and spatial dependence across time. Estimation is usually based on maximum likelihood or the generalized method of moments, but these methods may fail to yield unique solutions in high-dimensional settings. This article proposes a model-based gradient boosting algorithm that yields interpretable estimates and is feasible in both low- and high-dimensional settings. Owing to its modular nature, the algorithm is suitable for a variety of spatial panel models, including those with random and fixed effects. The general framework also enables data-driven model and variable selection as well as implicit regularization that controls the bias-variance trade-off, thereby improving prediction accuracy on out-of-sample spatial panel data. Monte Carlo experiments on estimation and variable-selection performance confirm that the approach works as intended in low- and high-dimensional settings, while real-world applications to non-life insurance in Italian districts, rice production on Indonesian farms, and life expectancy in German districts illustrate its potential.
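The abstract does not spell out the algorithm itself. As a rough illustration of why model-based (component-wise) gradient boosting can select variables and regularize implicitly even when covariates outnumber observations, the following is a minimal sketch under simplifying assumptions: squared-error loss, one simple linear base-learner per covariate, and no spatial lag, random effects, or fixed effects. It is not the article's estimator; the names `componentwise_l2_boost`, `n_steps`, and `nu` are hypothetical and chosen only for this example.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_steps=50, nu=0.1):
    """Generic component-wise L2 boosting (illustrative, non-spatial).

    Each iteration fits every single-covariate least-squares base-learner
    to the current residuals (the negative gradient of squared-error loss),
    keeps only the best-fitting one, and adds a shrunken version of it.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centred covariates
    col_ss = (Xc ** 2).sum(axis=0)       # sum of squares per column
    col_ss[col_ss == 0.0] = np.inf       # constant columns are never selected

    offset = y.mean()                    # initial fit: the sample mean
    coef = np.zeros(p)
    resid = y - offset

    for _ in range(n_steps):
        score = Xc.T @ resid             # inner product with the residuals
        gain = score ** 2 / col_ss       # RSS reduction of each base-learner
        j = int(np.argmax(gain))         # best-fitting covariate this round
        step = nu * score[j] / col_ss[j] # shrunken least-squares slope
        coef[j] += step
        resid -= step * Xc[:, j]

    intercept = offset - x_mean @ coef   # undo the centring
    return intercept, coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 100))       # more covariates than observations
    y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=40)
    intercept, coef = componentwise_l2_boost(X, y)
    print("covariates selected:", np.flatnonzero(coef))
```

Because only one coefficient is updated per iteration, a run with `n_steps` iterations can select at most `n_steps` covariates. In this sketch the number of iterations (stopping early) and the step length `nu` are what govern the bias-variance trade-off. In practice the stopping iteration would typically be chosen by cross-validation or a similar criterion, which this sketch leaves out.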