In this work we explore how language models can be used to analyze language and to discriminate between mentally impaired and healthy subjects by means of the perplexity metric. Perplexity was originally conceived as an information-theoretic measure of how well a given language model predicts a text sequence or, equivalently, of how well a word sequence fits a specific language model. We carried out extensive experiments on publicly available data, employing language models as diverse as N-grams (from 2-grams to 5-grams) and GPT-2, a transformer-based language model. We investigated whether perplexity scores can be used to discriminate between the transcripts of healthy subjects and those of subjects suffering from Alzheimer's disease (AD). Our best-performing models achieved full accuracy and F-score (1.00 in both precision/specificity and recall/sensitivity) in categorizing both AD and control subjects. These results suggest that perplexity can be a valuable analytical metric, with potential application to supporting the early diagnosis of symptoms of mental disorders.
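To make the notion of perplexity concrete, the following is a minimal sketch of how it might be computed for a bigram model: the exponential of the negative average log-probability that the model assigns to each token given its predecessor. The add-one smoothing, the `<unk>` handling, the toy sentences, and the function names `train_bigram` and `perplexity` are all assumptions made for illustration; the abstract does not state how the models in this work were smoothed or trained.

```python
import math
from collections import Counter

UNK = "<unk>"  # placeholder for tokens never seen in training (an assumption)


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>", UNK}
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        contexts.update(padded[:-1])                  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous, current) pairs
        vocab.update(tokens)
    return contexts, bigrams, vocab


def perplexity(tokens, model):
    """Perplexity of one tokenized sequence under an add-one-smoothed bigram model."""
    contexts, bigrams, vocab = model
    # Map out-of-vocabulary tokens to UNK so every token has a defined probability.
    mapped = [t if t in vocab else UNK for t in tokens]
    padded = ["<s>"] + mapped + ["</s>"]
    log_prob = 0.0
    for prev, cur in zip(padded[:-1], padded[1:]):
        # Add-one (Laplace) smoothing: no bigram gets zero probability.
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        log_prob += math.log(p)
    n = len(padded) - 1  # number of predicted tokens, including </s>
    return math.exp(-log_prob / n)


if __name__ == "__main__":
    # Toy training sentences, invented for this example.
    model = train_bigram([
        "the boy is on the stool".split(),
        "the girl is on the floor".split(),
    ])
    print(perplexity("the boy is on the stool".split(), model))
    print(perplexity("stool the on is boy the".split(), model))
```

A sequence that resembles the training text receives a lower perplexity than a scrambled one. It is this kind of score difference between transcripts that a downstream classifier could exploit.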