Rare gastrointestinal lesions are infrequently encountered in routine endoscopy, which limits the data available both for developing reliable artificial intelligence (AI) models and for training novice clinicians. Here we present EndoRare, a one-shot, retraining-free generative framework that synthesizes diverse, high-fidelity lesion exemplars from a single reference image. Using language-guided concept disentanglement, EndoRare separates pathognomonic lesion features from non-diagnostic attributes, encoding the former into a learnable prototype embedding while varying the latter to ensure diversity. We validated the framework on four rare pathologies (calcifying fibrous tumor, juvenile polyposis syndrome, familial adenomatous polyposis, and Peutz-Jeghers syndrome). Experts judged the synthetic images clinically plausible, and using them for data augmentation significantly improved downstream AI classifiers, raising the true-positive rate at low false-positive rates. Crucially, in a blinded reader study, novice endoscopists exposed to EndoRare-generated cases achieved a 0.400 increase in recall and a 0.267 increase in precision. These results establish a practical, data-efficient pathway to bridge the rare-disease gap in both computer-aided diagnostics and clinical education.
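To make the metric "true-positive rate at low false-positive rates" concrete, the following is a minimal sketch of one common way to compute it from classifier scores. The abstract does not specify the evaluation protocol, so the function name `tpr_at_fpr`, the thresholding rule, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import math


def tpr_at_fpr(labels, scores, max_fpr):
    """Return the true-positive rate at the loosest threshold whose
    false-positive rate does not exceed max_fpr.

    labels: iterable of 0/1 ground-truth values (1 = rare lesion).
    scores: iterable of classifier scores, higher = more likely positive.
    Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = sorted((s for y, s in zip(labels, scores) if y == 0), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    # Number of false positives we are allowed to accept.
    # The small epsilon guards against floating-point round-off in the product.
    allowed_fp = int(math.floor(max_fpr * len(neg) + 1e-9))

    if allowed_fp >= len(neg):
        threshold = float("-inf")  # every sample may be called positive
    else:
        # Predict positive only when score > threshold, so at most
        # allowed_fp negatives can exceed it, even with tied scores.
        threshold = neg[allowed_fp]

    true_positives = sum(1 for s in pos if s > threshold)
    return true_positives / len(pos)


# Toy data (made up): 4 positives and 10 negatives.
labels = [1, 1, 1, 1] + [0] * 10
scores = [0.9, 0.8, 0.4, 0.3,
          0.85, 0.5, 0.35, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05, 0.01]

tpr_at_10 = tpr_at_fpr(labels, scores, max_fpr=0.10)
tpr_at_0 = tpr_at_fpr(labels, scores, max_fpr=0.0)
```

Comparing this value for a classifier trained with and without synthetic augmentation, at the same `max_fpr`, is one way to quantify the kind of improvement described above.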