This paper addresses the current status and open challenges of synthetic speech detection. The work comprises an initial analysis of available open datasets and existing detection methods, a description of the requirements for new research datasets that comply with regulations and better represent real-world scenarios, and a discussion of the desired characteristics of future trustworthy detection methods in terms of both functional and non-functional requirements. Unlike other works, which are based on specific detection solutions or present a single dataset of synthetic speech, our paper is meant to orient future state-of-the-art research in the domain, so as to quickly narrow the current gap between synthesis and detection approaches.