Background: The integration of artificial intelligence (AI) into medicine has led to significant advances, particularly in diagnostics and treatment planning. However, the reliability of AI models depends heavily on the quality of their training data. This is especially true in medical imaging, where variation in patient data and evolving medical knowledge challenge the accuracy and generalizability of existing datasets. Results: The proposed approach integrates and enhances clinical computed tomography (CT) image series to improve their findability, accessibility, interoperability, and reusability (FAIR). An automated indexing process semantically enriches each CT image series by segmenting it with the TotalSegmentator framework and deriving SNOMED CT annotations from the resulting segmentations. The metadata is standardized using HL7 FHIR resources to enable efficient data discovery and data exchange between research projects. Conclusions: The study integrates a robust process within the UKSH MeDIC that has semantically enriched over 230,000 CT image series with over 8 million SNOMED CT annotations. The standardized representation using HL7 FHIR resources improves discoverability and facilitates interoperability, providing a foundation for the FAIRness of medical imaging data. However, developing automated annotation methods that keep pace with growing clinical datasets remains a challenge for continued progress in the large-scale integration and indexing of medical imaging for advanced healthcare AI applications.
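The indexing step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the segmentation has already been run and that its output is available as a list of structure labels such as `liver` and `spleen`. The lookup table `LABEL_TO_SNOMED`, the function names, the patient identifier, and the series UID are all hypothetical. The resource layout follows FHIR R4 `ImagingStudy`, where `series.bodySite` holds only a single Coding, so the sketch returns the annotations alongside the resource rather than assuming how the study attaches them.

```python
import json

SNOMED_SYSTEM = "http://snomed.info/sct"
DICOM_DCM_SYSTEM = "http://dicom.nema.org/resources/ontology/DCM"

# Hypothetical excerpt of a lookup table from segmentation labels to
# SNOMED CT concepts; a real table would cover every segmented structure.
LABEL_TO_SNOMED = {
    "liver": ("10200004", "Liver structure"),
    "spleen": ("78961009", "Splenic structure"),
}


def to_snomed_codings(labels):
    """Turn segmentation labels into FHIR Coding elements, skipping unmapped labels."""
    codings = []
    for label in sorted(set(labels)):  # de-duplicate and keep a stable order
        if label not in LABEL_TO_SNOMED:
            continue
        code, display = LABEL_TO_SNOMED[label]
        codings.append({"system": SNOMED_SYSTEM, "code": code, "display": display})
    return codings


def build_imaging_study(patient_id, series_uid):
    """Build a minimal FHIR R4 ImagingStudy with one CT series."""
    return {
        "resourceType": "ImagingStudy",
        "status": "available",
        "subject": {"reference": f"Patient/{patient_id}"},
        "series": [
            {
                "uid": series_uid,
                "modality": {"system": DICOM_DCM_SYSTEM, "code": "CT"},
            }
        ],
    }


def index_series(patient_id, series_uid, labels):
    """Pair the ImagingStudy with the SNOMED CT annotations found in the series."""
    return {
        "imaging_study": build_imaging_study(patient_id, series_uid),
        "annotations": to_snomed_codings(labels),
    }


if __name__ == "__main__":
    # Placeholder identifiers and labels, for illustration only.
    record = index_series("example", "1.2.3.4.5", ["spleen", "liver", "unmapped_label"])
    print(json.dumps(record, indent=2))
```

Keeping the label-to-concept table separate from the resource builder means new segmentation classes can be supported by extending the table alone, which matters when the annotation method has to keep up with a growing dataset.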