Accurate measurement of eyelid parameters such as margin reflex distances (MRD1, MRD2) and levator function (LF) is critical in oculoplastic diagnostics but remains limited by manual, inconsistent methods. This study evaluates three deep learning models, SE-ResNet, EfficientNet, and the vision-transformer-based DINOv2, for automating these measurements from smartphone-acquired images. We assess performance under frozen and fine-tuned settings using MSE, MAE, and R² metrics. DINOv2, pretrained through self-supervised learning, demonstrates superior scalability and robustness, especially in the frozen setting best suited to mobile deployment. Lightweight regressors such as an MLP and a deep ensemble offer high precision with minimal computational overhead. To address class imbalance and improve generalization, we integrate focal loss, orthogonal regularization, and binary encoding strategies. Our results show that DINOv2 combined with these enhancements delivers consistent, accurate predictions across all tasks, making it a strong candidate for real-world, mobile-friendly clinical applications. This work highlights the potential of foundation models in advancing AI-powered ophthalmic care.
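To make the frozen-backbone configuration described above concrete, the sketch below pairs a frozen DINOv2 feature extractor (ViT-S/14, loaded via torch.hub) with a lightweight MLP regression head predicting MRD1, MRD2, and LF. This is a minimal illustration, not the study's exact pipeline: the head architecture, hidden size, output count, and 224×224 input resolution are assumptions made for the example.

```python
# Minimal sketch: frozen DINOv2 features + lightweight MLP regression head.
# Assumptions (not from the paper): hidden size 256, three outputs (MRD1, MRD2, LF),
# 224x224 RGB crops (DINOv2 requires side lengths divisible by the 14-pixel patch size).
import torch
import torch.nn as nn

class EyelidRegressor(nn.Module):
    def __init__(self, n_outputs: int = 3, hidden: int = 256):
        super().__init__()
        # Self-supervised backbone; weights are frozen and never updated.
        self.backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Lightweight MLP head regressing the eyelid measurements (in mm).
        self.head = nn.Sequential(
            nn.Linear(384, hidden),   # ViT-S/14 CLS embedding is 384-dimensional
            nn.GELU(),
            nn.Linear(hidden, n_outputs),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():          # backbone stays frozen at train and test time
            feats = self.backbone(x)   # CLS embedding, shape (B, 384)
        return self.head(feats)

# Usage: a batch of two preprocessed periocular crops -> (2, 3) predictions.
model = EyelidRegressor()
preds = model(torch.randn(2, 3, 224, 224))
```

Because the backbone receives no gradients, only the small head is trained, which keeps memory and compute low enough for the mobile-friendly deployment scenario the abstract targets.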