Target volume contouring for radiation therapy is considered significantly more challenging than normal organ segmentation because it requires both image and text-based clinical information. Inspired by recent advances in large language models (LLMs) that facilitate the integration of textual information and images, we present a novel LLM-driven multimodal AI that utilizes clinical text information for the challenging task of target volume contouring in radiation therapy, and we validate it in the context of breast cancer radiation therapy. Using external validation and data-insufficient settings, both of which are highly relevant to real-world applications, we demonstrate that the proposed model performs markedly better than conventional vision-only AI models, with particularly robust generalization and data efficiency. To the best of our knowledge, this is the first LLM-driven multimodal AI model that integrates clinical text information into target volume delineation for radiation oncology.