Early diagnosis of Alzheimer's disease (AD) is crucial for delaying its progression. AI-based speech detection is non-invasive and cost-effective, but it faces a critical data-efficiency dilemma caused by the scarcity of medical data and by privacy barriers. We therefore propose FAL-AD, a novel framework that integrates federated learning with data augmentation to systematically optimize data efficiency. Our approach makes three key advances. First, it improves absolute efficiency through voice conversion-based augmentation, which generates diverse pathological speech samples via cross-category voice-content recombination. Second, it improves collaborative efficiency through an adaptive federated learning paradigm that maximizes cross-institutional benefits under privacy constraints. Third, it optimizes representational efficiency with an attentive cross-modal fusion model that achieves fine-grained word-level alignment and acoustic-textual interaction. Evaluated on ADReSSo, FAL-AD achieves a state-of-the-art multi-modal accuracy of 91.52%, outperforming all centralized baselines and demonstrating a practical solution to the data-efficiency dilemma. Our source code is publicly available at https://github.com/smileix/fal-ad.