Dysarthric speech data are scarce, which makes cross-lingual dysarthria detection an important but challenging problem. A key difficulty is that speech representations often encode language-dependent structure that can confound dysarthria detection. We propose a representation-level language shift (LS) that aligns source-language self-supervised speech representations with the target-language distribution using centroid-based vector adaptation estimated from healthy-control speech. We evaluate the approach on oral diadochokinetic (DDK) recordings from Parkinson's disease speech datasets in Czech, German, and Spanish under both cross-lingual and multilingual settings. LS substantially improves sensitivity and F1 in cross-lingual settings and yields smaller but consistent gains in multilingual settings. Representation analysis further shows that LS reduces language-identity information in the embedding space, supporting the interpretation that LS removes language-dependent structure.
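The abstract does not give the exact form of the centroid-based adaptation, so the following is only a minimal sketch of one plausible reading: a single translation vector, computed as the difference between the target-language and source-language healthy-control centroids, is added to every source-language embedding. The function name `language_shift`, its parameters, and the use of utterance-level embeddings stored as NumPy arrays are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def language_shift(source_emb, source_hc_emb, target_hc_emb):
    """Shift source-language embeddings toward the target-language centroid.

    Hypothetical sketch: assumes each row is one utterance-level embedding
    and that the shift is a single global translation vector estimated
    from healthy-control (HC) speech only.
    """
    source_emb = np.asarray(source_emb, dtype=float)
    source_centroid = np.mean(np.asarray(source_hc_emb, dtype=float), axis=0)
    target_centroid = np.mean(np.asarray(target_hc_emb, dtype=float), axis=0)

    # Translation vector pointing from the source HC centroid
    # to the target HC centroid.
    shift = target_centroid - source_centroid

    # Broadcasting applies the same vector to every source embedding,
    # healthy-control and dysarthric alike.
    return source_emb + shift


# Toy 2-D example with made-up values (not data from the paper).
src_hc = np.array([[0.0, 0.0], [2.0, 2.0]])
tgt_hc = np.array([[5.0, 1.0], [7.0, 3.0]])
src_patient = np.array([[1.0, 4.0]])

shifted_hc = language_shift(src_hc, src_hc, tgt_hc)
shifted_patient = language_shift(src_patient, src_hc, tgt_hc)
```

Under this reading, the shifted source healthy-control embeddings share the target healthy-control centroid, while offsets between patients and controls within the source language are left unchanged because every embedding receives the same translation.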