Generalized linear models (GLMs) are popular for data analysis in almost all quantitative sciences, but choosing the likelihood family and link function is often difficult. This motivates the search for likelihoods and links that minimize the impact of potential misspecification. We perform a large-scale simulation study on double-bounded and lower-bounded response data in which we systematically vary both the true and the assumed likelihoods and links. In contrast to previous studies, we evaluate posterior calibration and uncertainty metrics in addition to point-estimate accuracy. Our results indicate that certain likelihoods and links can be remarkably robust to misspecification, performing almost on par with their respective true counterparts. Additionally, normal likelihood models with identity link (i.e., linear regression) often achieve calibration comparable to the more structurally faithful alternatives, at least in the studied scenarios. On the basis of our findings, we provide practical suggestions for robust likelihood and link choices in GLMs.