This perspective calls on scholars across disciplines to address the challenge of audio deepfake detection and discernment through an interdisciplinary lens that combines artificial intelligence (AI) methods and linguistics. While tools for generating realistic-sounding fake speech are proliferating rapidly, the detection of deepfakes is lagging behind. A particular hindrance to audio deepfake detection is that current AI models lack a full understanding of the inherent variability of language and of the complexity and uniqueness of human speech. We see promise in recent transdisciplinary work that incorporates linguistic knowledge into AI approaches, providing pathways for expert-in-the-loop detection and moving beyond expert-agnostic AI-based methods toward more robust and comprehensive deepfake detection.