Alzheimer's Disease (AD) is a significant and growing public health concern. Investigating alterations in speech and language patterns offers a promising path towards cost-effective and non-invasive early detection of AD on a large scale. Large language models (LLMs), such as GPT, have enabled powerful new possibilities for semantic text analysis. In this study, we leverage GPT-4 to extract five semantic features from transcripts of spontaneous patient speech. The features capture known symptoms of AD, but they are difficult to quantify effectively using traditional methods of computational linguistics. We demonstrate the clinical significance of these features and further validate one of them ("Word-Finding Difficulties") against a proxy measure and human raters. When combined with established linguistic features and a Random Forest classifier, the GPT-derived features significantly improve the detection of AD. Our approach proves effective for both manually transcribed and automatically generated transcripts, representing a novel and impactful use of recent advancements in LLMs for AD speech analysis.
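As a rough illustration of the pipeline described above, the following minimal sketch shows how one semantic feature might be requested from an LLM, how the reply might be turned into a number, and how that number might be appended to established linguistic features before classification. The abstract does not give the study's prompts, rating scale, or feature encoding, so the prompt wording, the 0 to 5 scale, and the function names `build_prompt`, `parse_score`, and `combine_features` are all assumptions made for illustration. No model is called here; the reply is treated as a plain string.

```python
import re
from typing import Optional

# The only GPT-derived feature named in the abstract.
FEATURE_NAME = "Word-Finding Difficulties"


def build_prompt(transcript: str, feature: str = FEATURE_NAME, scale_max: int = 5) -> str:
    """Assemble a hypothetical rating prompt (wording and scale are assumed)."""
    return (
        f"Rate the following speech transcript for '{feature}' "
        f"on a scale from 0 (absent) to {scale_max} (severe). "
        "Answer with a single integer.\n\n"
        f"Transcript:\n{transcript.strip()}"
    )


def parse_score(reply: str, scale_max: int = 5) -> Optional[int]:
    """Return the first integer in the reply if it lies in [0, scale_max], else None."""
    match = re.search(r"-?\d+", reply)
    if match is None:
        return None
    score = int(match.group())
    return score if 0 <= score <= scale_max else None


def combine_features(linguistic: list, llm_scores: list) -> list:
    """Concatenate established linguistic features with LLM-derived scores."""
    return list(linguistic) + list(llm_scores)
```

The combined vector would then be passed to a classifier such as a Random Forest. Returning `None` for an unparseable or out-of-range reply keeps a malformed model answer from silently entering the feature vector.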