Machine-learning algorithms can facilitate low-cost, user-guided visual diagnostic platforms for addressing disparities in access to sexual health services. We developed a clinical image dataset using original and augmented images for five penile diseases: herpes eruption, syphilitic chancres, penile candidiasis, penile cancer, and genital warts. We used a U-net architecture model for semantic pixel segmentation into background or subject image, the Inception-ResNet version 2 neural architecture to classify each pixel as diseased or non-diseased, and a salience map using GradCAM++. We trained the model on a random 91% sample of the image database using 150 epochs per image, and evaluated the model on the remaining 9% of images, assessing recall (or sensitivity), precision, specificity, and F1-score (accuracy). Of the 239 images in the validation dataset, 45 (18.8%) were of genital warts, 43 (18.0%) were of HSV infection, 29 (12.1%) were of penile cancer, 40 (16.7%) were of penile candidiasis, 37 (15.5%) were of syphilitic chancres, and 45 (18.8%) were of non-diseased penises. The overall accuracy of the model for correctly classifying the diseased image was 0.944. Between July 1 and October 1, 2023, there were 2,640 unique users of the mobile platform. Among a random sample of submissions (n=437), 271 (62.0%) were from the United States, 64 (14.6%) from Singapore, 41 (9.4%) from Canada, 40 (9.2%) from the United Kingdom, and 21 (4.8%) from Vietnam. The majority (n=277 [63.4%]) were between 18 and 30 years old. We report on the development of a machine-learning model for classifying five penile diseases, which demonstrated excellent performance on a validation dataset. The model is currently in use globally and has the potential to improve access to diagnostic services for penile diseases.
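The evaluation metrics named above (recall, precision, specificity, F1-score, and overall accuracy) can be illustrated with a minimal sketch that derives them one class at a time from a multi-class confusion matrix. This is not the authors' evaluation code. The function names, the three class labels, and the toy counts below are hypothetical and chosen only for illustration; they are not results from the study. The F1-score here is the harmonic mean of precision and recall, and overall accuracy is computed separately as the fraction of images on the diagonal.

```python
from typing import Dict, List


def per_class_metrics(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Compute one-vs-rest metrics for each class.

    confusion[i][j] is the number of images whose true class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                          # true class k, predicted k
        fn = sum(confusion[k]) - tp                   # true class k, predicted other
        fp = sum(row[k] for row in confusion) - tp    # other class, predicted k
        tn = total - tp - fn - fp                     # everything else
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {
            "recall": recall,
            "precision": precision,
            "specificity": specificity,
            "f1": f1,
        }
    return metrics


def overall_accuracy(confusion: List[List[int]]) -> float:
    """Fraction of all images that fall on the diagonal of the matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


# Hypothetical toy data (NOT from the study): three classes, 10 images each.
toy_labels = ["class_a", "class_b", "class_c"]
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

if __name__ == "__main__":
    for name, values in per_class_metrics(toy_confusion, toy_labels).items():
        print(name, {key: round(val, 3) for key, val in values.items()})
    print("overall accuracy:", round(overall_accuracy(toy_confusion), 3))
```

Treating each class one-vs-rest keeps specificity well defined in a six-class setting (five diseases plus non-diseased), where a "negative" is any image that does not belong to the class in question.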