Flexible laryngoscopy is commonly performed by otolaryngologists to detect laryngeal diseases and to recognize potentially malignant lesions. Recently, researchers have applied machine learning techniques to automate diagnosis from laryngeal images, with promising results. Diagnostic performance can be improved further when patients' demographic information is incorporated into the models. However, manual entry of patient data is time-consuming for clinicians. In this study, we make the first attempt to use deep learning models to predict patient demographic information, with the aim of improving detector model performance. The overall accuracies for gender, smoking history, and age were 85.5%, 65.2%, and 75.9%, respectively. We also created a new laryngoscopic image set for machine learning research and benchmarked eight classical deep learning models based on CNNs and Transformers. The predicted demographic information can be integrated into current learning models to improve their performance.