Speech anonymization and de-identification have attracted significant attention recently, especially in healthcare applications such as telehealth consultations, patient voiceprint matching, and real-time patient monitoring. Speaker identity classification, which recognizes specific speakers from audio in order to learn identity features, is crucial for de-identification. Because few studies have effectively combined speech anonymization with identity classification, we propose SAIC, a pipeline that integrates Speech Anonymization and Identity Classification. SAIC achieves state-of-the-art performance on the speaker identity classification task on the VoxCeleb1 dataset, with a top-1 accuracy of 96.1%. Although SAIC is not trained or evaluated on clinical data, this result demonstrates the model's effectiveness and suggests that it could generalize to the healthcare domain, providing guidance for future work.