Neural networks have become a popular tool in predictive modelling, more commonly associated with machine learning and artificial intelligence than with statistics. Generalised Additive Models (GAMs) are flexible non-linear statistical models that retain interpretability. Both are state-of-the-art in their own right, each with its own advantages and disadvantages. This paper analyses how these two model classes have performed on real-world tabular data. Following PRISMA guidelines, we conducted a systematic review of papers that empirically compared GAMs and neural networks. We identified 143 eligible papers covering 430 datasets, and extracted and reported key attributes at both the paper and dataset levels. Beyond summarising the comparisons, we analyse the reported performance metrics using mixed-effects modelling to investigate characteristics that may explain and quantify the observed differences, including application area, study year, sample size, number of predictors, and neural network complexity. Across datasets, we found no consistent evidence that either GAMs or neural networks were superior on the most frequently reported metrics (RMSE, $R^2$, and AUC). Neural networks tended to outperform GAMs on larger datasets and on those with more predictors, but this advantage narrowed over time. GAMs, in turn, remained competitive, particularly in smaller data settings, while retaining interpretability. Reporting of dataset characteristics and neural network complexity was incomplete in much of the literature, limiting transparency and reproducibility. This review highlights that GAMs and neural networks should be viewed as complementary approaches rather than competitors. For many tabular applications, the performance trade-off is modest, and interpretability may favour GAMs.