The advancement of AI-assisted myopia screening requires jointly diagnosing the high myopia (HM) status of both eyes (OU) and predicting their axial length (AL). This clinical requirement leads to a complex mixed-type (binary-continuous) multitask learning problem with bi-domain (OU) image covariates, which raises two key challenges: i) capturing the inter-ocular asymmetry of OU images within a cutting-edge foundation model; ii) modeling and estimating the conditional dependence structure among mixed-type multivariate responses given image covariates. We address these challenges by: i) attaching residual adapters to the Vision Transformer foundation model to capture OU similarity and heterogeneity simultaneously; ii) developing a four-dimensional copula loss, implementable in PyTorch, based on a latent variable expression of the Gaussian copula likelihood, and proposing a computationally efficient fast Monte Carlo Expectation Maximization (fMCEM) algorithm to estimate the copula parameters. We further formulate a specific overfitting problem in multitask learning, which we call the stronger covariance phenomenon. We show how this phenomenon disturbs the estimation of copula parameters and theoretically demonstrate that the proposed fMCEM algorithm remains numerically stable under this disturbance. Experiments on our annotated OU ultra-widefield fundus image dataset and simulations on synthetic data demonstrate that our method stably improves predictive performance on both the classification and regression tasks.
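The latent variable expression of the Gaussian copula likelihood for mixed binary-continuous responses can be illustrated with a minimal sketch. It is deliberately reduced to the bivariate case (one continuous response such as AL, one binary response such as HM status), whereas the loss described above is four-dimensional and implemented in PyTorch. The function name `mixed_copula_nll`, its parameters, the Gaussian marginal for the continuous response, and the thresholded-latent form of the binary response are all assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def mixed_copula_nll(y, d, mu, sigma, p, rho):
    """Negative log-likelihood of one (continuous y, binary d) pair.

    Assumed illustrative model (bivariate only):
      y = mu + sigma * e1,   d = 1{e2 > -Phi^{-1}(p)},
      (e1, e2) standard bivariate normal with correlation rho.
    Marginally y ~ N(mu, sigma^2) and P(d = 1) = p, with 0 < p < 1
    and -1 < rho < 1.
    """
    e1 = (y - mu) / sigma                 # standardized residual
    eta = _STD_NORMAL.inv_cdf(p)          # probit-scale threshold
    # Given e1, the latent e2 is N(rho * e1, 1 - rho^2), so
    # P(d = 1 | y) = Phi((eta + rho * e1) / sqrt(1 - rho^2)).
    p_cond = _STD_NORMAL.cdf((eta + rho * e1) / math.sqrt(1.0 - rho**2))
    p_cond = min(max(p_cond, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    log_density_y = (-0.5 * e1**2 - math.log(sigma)
                     - 0.5 * math.log(2.0 * math.pi))
    log_prob_d = math.log(p_cond) if d == 1 else math.log(1.0 - p_cond)
    return -(log_density_y + log_prob_d)


if __name__ == "__main__":
    # With rho = 0 the loss is the Gaussian NLL plus binary cross-entropy.
    print(mixed_copula_nll(y=1.0, d=1, mu=0.0, sigma=1.0, p=0.5, rho=0.0))
    # A positive rho makes d = 1 more likely when y lies above its mean.
    print(mixed_copula_nll(y=1.0, d=1, mu=0.0, sigma=1.0, p=0.5, rho=0.5))
```

In this sketch, setting `rho = 0` recovers the usual sum of a regression loss and a classification loss, so the correlation parameter is the only place where the dependence between the two tasks enters. In a training setting `mu` and `p` would presumably come from network heads, and `rho` would be estimated separately, for example by an EM-type procedure.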