A knowledge gap persists between machine learning (ML) developers (e.g., data scientists) and practitioners (e.g., clinicians), hampering the full utilization of ML for clinical data analysis. We investigated the potential of ChatGPT Advanced Data Analysis (ADA), an extension of GPT-4, to bridge this gap and perform ML analyses efficiently. Real-world clinical datasets and study details from large trials across various medical specialties were presented to ChatGPT ADA without specific guidance. ChatGPT ADA autonomously developed state-of-the-art ML models based on the original studies' training data to predict clinical outcomes such as cancer development, cancer progression, and disease complications, or biomarkers such as pathogenic gene sequences. Strikingly, these ML models matched or outperformed their published counterparts. We conclude that ChatGPT ADA offers a promising avenue to democratize ML in medicine, making advanced analytics accessible to non-ML experts and promoting broader applications in medical research and practice.