This paper explores the evolving relationship among three factors: clinician trust in LLMs, the shift in data sources from predominantly human-generated to AI-generated content, and the resulting impact on LLM precision and clinician competence. A primary concern is the feedback loop that may arise as LLMs come to rely more heavily on their own outputs for learning. Such a loop could degrade output quality and erode clinician skills through decreased engagement with fundamental diagnostic processes. Although still theoretical, this feedback loop poses a significant challenge as the integration of LLMs in healthcare deepens, and it underscores the need for proactive dialogue and strategic measures to ensure the safe and effective use of LLM technology. A key takeaway from our investigation is the critical role of user expertise and the need for a discerning approach to trusting and validating LLM outputs. Expert users, particularly clinicians, can leverage LLMs to enhance productivity by offloading routine tasks while maintaining critical oversight to identify and correct inaccuracies in AI-generated content. This balance of trust and skepticism is vital for ensuring that LLMs augment rather than undermine the quality of patient care. We also examine the risks associated with self-referential learning loops in LLMs and the deskilling of healthcare professionals. When LLMs operate within an echo chamber, where AI-generated content feeds back into the learning algorithms, the diversity and quality of the data pool are threatened, which may entrench biases and reduce the efficacy of LLMs.