We propose a variational autoencoder (VAE) approach for parameter estimation in nonlinear mixed-effects models based on ordinary differential equations (NLME-ODEs) using longitudinal data from multiple subjects. In moderate dimensions, likelihood-based inference via the stochastic approximation EM (SAEM) algorithm is widely used, but it relies on Markov chain Monte Carlo (MCMC) to approximate subject-specific posteriors. As model complexity increases, or when observations per subject are sparse and irregular, performance often deteriorates because the likelihood surface becomes complex and multimodal, which can cause MCMC convergence difficulties. We instead estimate parameters by maximizing the evidence lower bound (ELBO), a regularized surrogate for the marginal likelihood. A VAE with a shared encoder amortizes inference of the subject-specific random effects, avoiding both per-subject optimization and MCMC. Beyond point estimation, we quantify parameter uncertainty using an observed-information-based variance estimator and verify that the practical identifiability of the model parameters is not compromised by the nuisance parameters introduced in the encoder. We evaluate the method in three simulation case studies (pharmacokinetics, humoral response to vaccination, and TGF-$\beta$ activation dynamics in asthmatic airways) and on a real-world antibody kinetics dataset, comparing against SAEM baselines.
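For orientation, the standard per-subject form of the ELBO can be written as below. This is a generic sketch rather than the paper's exact objective, and the notation is assumed here for illustration: $y_i$ denotes the observations of subject $i$, $b_i$ its random effects, $\theta$ the model parameters, and $\phi$ the encoder parameters.

```latex
% Generic per-subject ELBO (notation assumed, not taken from the paper):
%   y_i    observations of subject i
%   b_i    subject-specific random effects
%   theta  model (population) parameters
%   phi    parameters of the shared encoder q_phi
\log p_\theta(y_i)
  \;\ge\;
  \mathbb{E}_{q_\phi(b_i \mid y_i)}\!\left[ \log p_\theta(y_i \mid b_i) \right]
  - \mathrm{KL}\!\left( q_\phi(b_i \mid y_i) \,\middle\|\, p_\theta(b_i) \right)
  \;=:\; \mathrm{ELBO}_i(\theta, \phi)
```

The gap in this bound equals $\mathrm{KL}\left(q_\phi(b_i \mid y_i) \,\|\, p_\theta(b_i \mid y_i)\right)$, so the bound is tight when the encoder recovers the exact subject-specific posterior. Summing $\mathrm{ELBO}_i$ over subjects gives an objective that is maximized jointly in $\theta$ and $\phi$.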