Bayesian high-dimensional covariate selection in non-linear mixed-effects models using the SAEM algorithm

High-dimensional variable selection, with many more covariates than observations, is widely documented in standard regression models, but there are still few tools to address it in non-linear mixed-effects models where data are collected repeatedly on several individuals. In this work, variable selection is approached from a Bayesian perspective and a selection procedure is proposed, combining the use of a spike-and-slab prior and the SAEM algorithm. Similarly to Lasso regression, the set of relevant covariates is selected by exploring a grid of values for the penalisation parameter. The SAEM approach is much faster than a classical MCMC algorithm and our method shows very good selection performances on simulated data. Its flexibility is demonstrated by implementing it for a variety of nonlinear mixed effects models. The usefulness of the proposed method is illustrated on a problem of genetic markers identification, relevant for genomic-assisted selection in plant breeding.

翻译：高维变量选择（协变量数量远多于观测数量）在标准回归模型中已有广泛研究，但在非线性混合效应模型（其中数据在多个个体上重复收集）中仍缺乏相关工具。本研究从贝叶斯视角处理变量选择问题，提出一种结合spike-and-slab先验与SAEM算法的选择程序。与Lasso回归类似，通过遍历惩罚参数网格值选取相关协变量集。SAEM方法比经典MCMC算法计算速度更快，且该方法在模拟数据上展现出优异的变量选择性能。通过在多种非线性混合效应模型中的实现，验证了其灵活性。最后在植物育种基因组辅助选择相关的遗传标记识别问题中展示了该方法的实用性。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

宾夕法尼亚大学最新《不确定性估计》课程笔记，134页pdf，附Slides

专知会员服务

49+阅读 · 2022年11月13日

不可错过！杜克大学《因果推断》课程，全面讲述因果推理

专知会员服务

52+阅读 · 2022年10月22日

不可错过！700+ppt《因果推理》课程！杜克大学Fan Li教程

专知会员服务

73+阅读 · 2022年7月11日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日