Early-stage Parkinson's disease (EarlyPD) detection from speech is clinically meaningful yet underexplored, and published results are hard to compare because studies differ in datasets, languages, tasks, evaluation protocols, and EarlyPD definitions. To address this issue, we propose the first benchmark for speech-based EarlyPD detection, with a speaker-independent split designed for fair and replicable cross-method evaluation on researcher-accessible datasets. The benchmark covers three common speech tasks and evaluates methods under different training-resource settings. We also present multi-dimensional evaluation breakdowns by dataset, aggregation level, gender, and disease stage to support fine-grained comparisons and clinical adoption. Our results provide a replicable reference and actionable insights, and we encourage adoption of this publicly available benchmark to advance robust and clinically meaningful EarlyPD detection from speech.