A Training-Efficient Transformer-Based Anti-Spoofing Network for Logical Access in ASVspoof 5 - 专知论文

会员服务 ·

0

损失 · 系统 · 排序 · Transformer · 精度 ·

A Training-Efficient Transformer-Based Anti-Spoofing Network for Logical Access in ASVspoof 5

翻译：一种高效训练的基于Transformer的反欺骗网络用于ASVspoof 5逻辑访问任务

Sidan Yin,Bo Zhao

from arxiv, 11 pages, 2 figures

Synthetic and manipulated speech can reduce the reliability of automatic speaker verification systems, so anti-spoofing methods need to be both accurate and efficient in training and inference. This paper focuses on the ASVspoof 5 Track 1 closed condition, where standard cross-entropy training may not give enough attention to hard trials and is not directly aligned with ranking- and threshold-based evaluation metrics. We propose TFPARN, a Transformer-based focal-pairwise attentive ranking network. The system extracts log-Mel features from speech, uses a Transformer encoder to model frame-level information, applies attention pooling to obtain utterance-level representations, and is trained with a combination of focal classification loss and pairwise ranking loss. RawBoost augmentation is used during training, and test-time augmentation is applied during evaluation to improve robustness. Compared with re-implemented AASIST and RawNet2 baselines under the same protocol, TFPARN achieves the best results, with a minDCF of 0.2430 and an EER of 12.52%. Ablation experiments further show that the pairwise loss, focal loss, and attention pooling all improve performance. TFPARN also uses the lowest inference memory among the compared systems, at 1.4 GB, runs at about 0.79 ms per utterance, and reaches its best checkpoint in less training time than AASIST. These results show that TFPARN provides a good balance between detection accuracy and computational cost for logical access anti-spoofing.

翻译：合成及篡改语音会降低自动说话人验证系统的可靠性，因此反欺骗方法需在训练与推理过程中兼具高精度与高效率。本文聚焦ASVspoof 5赛道1封闭条件，指出标准交叉熵训练可能对困难样本关注不足，且无法直接对齐基于排序与阈值的评估指标。我们提出TFPARN——一种基于Transformer的焦点-成对注意力排序网络。该系统从语音中提取对数梅尔特征，利用Transformer编码器建模帧级信息，通过注意力池化获取话语级表征，并采用焦点分类损失与成对排序损失的联合训练策略。训练阶段使用RawBoost数据增强，评估阶段应用测试时增强以提升鲁棒性。在与相同协议下重新实现的AASIST和RawNet2基线对比中，TFPARN取得最优结果：最小检测代价函数(minDCF)为0.2430，等错误率(EER)为12.52%。消融实验进一步表明，成对损失、焦点损失及注意力池化均能提升性能。TFPARN在对比系统中推理内存最低（1.4 GB），每句话处理耗时约0.79毫秒，且达到最佳检查点所需的训练时间少于AASIST。上述结果表明，TFPARN在逻辑访问反欺骗任务中实现了检测精度与计算成本的良好平衡。

0

相关内容

用Transformer学习通用超参数优化器，DeepMind Yutian Chen博士讲授，附Slides与视频

用Transformer学习通用超参数优化器，DeepMind Yutian Chen博士讲授，附Slides与视频

专知会员服务

40+阅读 · 2023年3月12日

Transformer如何训得更快更好？莫纳什大学最新《Transformer高效训练》综述，详述训练Transformer技术

Transformer如何训得更快更好？莫纳什大学最新《Transformer高效训练》综述，详述训练Transformer技术

专知会员服务

61+阅读 · 2023年2月4日

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

专知会员服务

17+阅读 · 2022年3月19日

【Google】高效Transformer综述，Efficient Transformers: A Survey

【Google】高效Transformer综述，Efficient Transformers: A Survey

专知会员服务

66+阅读 · 2022年3月17日

【WWW2020-腾讯】未来的数据有助于训练:为基于会话的推荐建立未来的上下文模型，Future Data Helps Training: Modelling Future Contexts for Session-based Recommendation

【WWW2020-腾讯】未来的数据有助于训练:为基于会话的推荐建立未来的上下文模型，Future Data Helps Training: Modelling Future Contexts for Session-based Recommendation

专知会员服务

25+阅读 · 2020年3月15日

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

专知会员服务

51+阅读 · 2020年3月7日

【Amazon】使用预先训练的Transformer模型进行数据增强

【Amazon】使用预先训练的Transformer模型进行数据增强

专知会员服务

58+阅读 · 2020年3月6日

Google研究院提出FixMatch，简单粗暴却极其有效的半监督学习方法，附14页PDF下载

Google研究院提出FixMatch，简单粗暴却极其有效的半监督学习方法，附14页PDF下载

专知会员服务

54+阅读 · 2020年1月24日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

33+阅读 · 2020年1月5日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

武大提出FarSeg：遥感图像分割新网络，解决前景背景不平衡问题 | CVPR 2020

武大提出FarSeg：遥感图像分割新网络，解决前景背景不平衡问题 | CVPR 2020

CVer

17+阅读 · 2020年7月10日

【Amazon】使用预训练Transformer模型进行数据增强

【Amazon】使用预训练Transformer模型进行数据增强

专知

12+阅读 · 2020年3月6日

Transformers就是图神经网络？NTU-Chaitanya Joshi论述: 是GNN的一个特例

Transformers就是图神经网络？NTU-Chaitanya Joshi论述: 是GNN的一个特例

专知

20+阅读 · 2020年3月1日

谷歌NIPS论文Transformer模型解读：只要Attention就够了

谷歌NIPS论文Transformer模型解读：只要Attention就够了

AI100

14+阅读 · 2019年9月9日

最强NLP预训练模型库PyTorch-Transformers正式开源！支持6个预训练框架，27个预训练模型

最强NLP预训练模型库PyTorch-Transformers正式开源！支持6个预训练框架，27个预训练模型

AI前线

12+阅读 · 2019年7月22日

用于语音识别的数据增强

用于语音识别的数据增强

AI研习社

24+阅读 · 2019年6月5日

TensorFlow 2.0官方Transformer教程 (Attention is All you Need)

TensorFlow 2.0官方Transformer教程 (Attention is All you Need)

专知

54+阅读 · 2019年4月12日

加入Transformer-XL，这个PyTorch包能调用各种NLP预训练模型

加入Transformer-XL，这个PyTorch包能调用各种NLP预训练模型

机器之心

15+阅读 · 2019年2月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

TextInfoExp:自然语言处理相关实验（基于sougou数据集）

TextInfoExp:自然语言处理相关实验（基于sougou数据集）

全球人工智能

12+阅读 · 2017年11月12日

基于图的半监督学习算法研究

国家自然科学基金

5+阅读 · 2015年12月31日

基于信道差异的物理层安全编码技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

非确定型Web服务流程重组的可靠性验证技术

国家自然科学基金

1+阅读 · 2015年12月31日

基于深度神经网络的雷达目标高分辨距离像稳健识别方法

国家自然科学基金

6+阅读 · 2015年12月31日

基于高维流形计算的混沌密码攻击方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于深度学习和马尔科夫逻辑网络的特殊视频识别研究

国家自然科学基金

0+阅读 · 2015年12月31日

无线传感器网络中高效的虚假数据过滤方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于概率本体的CPS入侵检测方法研究

国家自然科学基金

1+阅读 · 2014年12月31日

社交网络环境下基于协同过滤的上下文感知推荐系统研究

国家自然科学基金

6+阅读 · 2014年12月31日

多语言大数据环境下的复杂网络行为分析、预测和干预

国家自然科学基金

4+阅读 · 2014年12月31日

teasr: training-efficient any-step diffusion transformer for real-world image super-resolution

Arxiv

0+阅读 · 6月15日

MSpoofTTS: Multi-Resolution Spoof-Guided Inference for Discrete Speech Synthesis

Arxiv

0+阅读 · 6月14日

From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing

Arxiv

0+阅读 · 6月12日

SpAArSIST: Sparsified AASIST for Efficient and Reliable Anti-Spoofing

Arxiv

0+阅读 · 6月10日

RAT: Reference-Augmented Training for ASV Anti-Spoofing

Arxiv

0+阅读 · 6月9日

MultiAPI Spoof: A Multi-API Dataset and Local-Attention Network for Speech Anti-spoofing Detection

Arxiv

0+阅读 · 6月9日

Speaker-Invariant Representation Learning for Spoofing Detection via Gradient Reversal and A Variational Information Bottleneck

Arxiv

0+阅读 · 6月7日

An Expanded Synthetic Conversation Dataset for Multi-Turn Smishing Detection

Arxiv

0+阅读 · 6月5日

CTFExplorer: Evaluating LLM Offensive Agents Through Multi-Target Web CTF Benchmarking

Arxiv

0+阅读 · 5月20日

An Efficient Machine Learning-based Framework for Detection and Prevention of Frauds in Telecom Networks

Arxiv

0+阅读 · 5月17日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

1+阅读 · 今天14:45

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

1+阅读 · 今天14:43

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

3+阅读 · 今天14:31

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

3+阅读 · 今天14:20

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

2+阅读 · 今天14:11

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

3+阅读 · 今天14:07

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

3+阅读 · 今天14:03

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

2+阅读 · 今天13:59

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

5+阅读 · 6月22日

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

8+阅读 · 6月22日

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

7+阅读 · 6月22日

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

4+阅读 · 6月22日

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

5+阅读 · 6月22日

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

5+阅读 · 6月22日

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

8+阅读 · 6月22日

相关VIP内容

用Transformer学习通用超参数优化器，DeepMind Yutian Chen博士讲授，附Slides与视频

用Transformer学习通用超参数优化器，DeepMind Yutian Chen博士讲授，附Slides与视频

专知会员服务

40+阅读 · 2023年3月12日

Transformer如何训得更快更好？莫纳什大学最新《Transformer高效训练》综述，详述训练Transformer技术

Transformer如何训得更快更好？莫纳什大学最新《Transformer高效训练》综述，详述训练Transformer技术

专知会员服务

61+阅读 · 2023年2月4日

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

【CVPR 2022】基于灵活模态Transformer的人脸防伪 FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

专知会员服务

17+阅读 · 2022年3月19日

【Google】高效Transformer综述，Efficient Transformers: A Survey

【Google】高效Transformer综述，Efficient Transformers: A Survey

专知会员服务

66+阅读 · 2022年3月17日

【WWW2020-腾讯】未来的数据有助于训练:为基于会话的推荐建立未来的上下文模型，Future Data Helps Training: Modelling Future Contexts for Session-based Recommendation

【WWW2020-腾讯】未来的数据有助于训练:为基于会话的推荐建立未来的上下文模型，Future Data Helps Training: Modelling Future Contexts for Session-based Recommendation

专知会员服务

25+阅读 · 2020年3月15日

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

【Amazon】使用预先训练的Transformer模型进行数据增强，Data Augmentation using Pre-trained Transformer Models

专知会员服务

51+阅读 · 2020年3月7日

【Amazon】使用预先训练的Transformer模型进行数据增强

【Amazon】使用预先训练的Transformer模型进行数据增强

专知会员服务

58+阅读 · 2020年3月6日

Google研究院提出FixMatch，简单粗暴却极其有效的半监督学习方法，附14页PDF下载

Google研究院提出FixMatch，简单粗暴却极其有效的半监督学习方法，附14页PDF下载

专知会员服务

54+阅读 · 2020年1月24日

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

【中科院自动化所】序列到序列语音识别的无监督预训练（Unsupervised pre-training for sequence to sequence speech recognition）

专知会员服务

33+阅读 · 2020年1月5日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 世界动作模型：少做梦，多行动

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

美以伊冲突：无人机与人工智能的运用

相关资讯

武大提出FarSeg：遥感图像分割新网络，解决前景背景不平衡问题 | CVPR 2020

武大提出FarSeg：遥感图像分割新网络，解决前景背景不平衡问题 | CVPR 2020

CVer

17+阅读 · 2020年7月10日

【Amazon】使用预训练Transformer模型进行数据增强

【Amazon】使用预训练Transformer模型进行数据增强

专知

12+阅读 · 2020年3月6日

Transformers就是图神经网络？NTU-Chaitanya Joshi论述: 是GNN的一个特例

Transformers就是图神经网络？NTU-Chaitanya Joshi论述: 是GNN的一个特例

专知

20+阅读 · 2020年3月1日

谷歌NIPS论文Transformer模型解读：只要Attention就够了

谷歌NIPS论文Transformer模型解读：只要Attention就够了

AI100

14+阅读 · 2019年9月9日

最强NLP预训练模型库PyTorch-Transformers正式开源！支持6个预训练框架，27个预训练模型

最强NLP预训练模型库PyTorch-Transformers正式开源！支持6个预训练框架，27个预训练模型

AI前线

12+阅读 · 2019年7月22日

用于语音识别的数据增强

用于语音识别的数据增强

AI研习社

24+阅读 · 2019年6月5日

TensorFlow 2.0官方Transformer教程 (Attention is All you Need)

TensorFlow 2.0官方Transformer教程 (Attention is All you Need)

专知

54+阅读 · 2019年4月12日

加入Transformer-XL，这个PyTorch包能调用各种NLP预训练模型

加入Transformer-XL，这个PyTorch包能调用各种NLP预训练模型

机器之心

15+阅读 · 2019年2月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

TextInfoExp:自然语言处理相关实验（基于sougou数据集）

TextInfoExp:自然语言处理相关实验（基于sougou数据集）

全球人工智能

12+阅读 · 2017年11月12日

相关论文

teasr: training-efficient any-step diffusion transformer for real-world image super-resolution

Arxiv

0+阅读 · 6月15日

MSpoofTTS: Multi-Resolution Spoof-Guided Inference for Discrete Speech Synthesis

Arxiv

0+阅读 · 6月14日

From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing

Arxiv

0+阅读 · 6月12日

SpAArSIST: Sparsified AASIST for Efficient and Reliable Anti-Spoofing

Arxiv

0+阅读 · 6月10日

RAT: Reference-Augmented Training for ASV Anti-Spoofing

Arxiv

0+阅读 · 6月9日

MultiAPI Spoof: A Multi-API Dataset and Local-Attention Network for Speech Anti-spoofing Detection

Arxiv

0+阅读 · 6月9日

Speaker-Invariant Representation Learning for Spoofing Detection via Gradient Reversal and A Variational Information Bottleneck

Arxiv

0+阅读 · 6月7日

An Expanded Synthetic Conversation Dataset for Multi-Turn Smishing Detection

Arxiv

0+阅读 · 6月5日

CTFExplorer: Evaluating LLM Offensive Agents Through Multi-Target Web CTF Benchmarking

Arxiv

0+阅读 · 5月20日

An Efficient Machine Learning-based Framework for Detection and Prevention of Frauds in Telecom Networks

Arxiv

0+阅读 · 5月17日

相关基金

基于图的半监督学习算法研究

国家自然科学基金

5+阅读 · 2015年12月31日

基于信道差异的物理层安全编码技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

非确定型Web服务流程重组的可靠性验证技术

国家自然科学基金

1+阅读 · 2015年12月31日

基于深度神经网络的雷达目标高分辨距离像稳健识别方法

国家自然科学基金

6+阅读 · 2015年12月31日

基于高维流形计算的混沌密码攻击方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于深度学习和马尔科夫逻辑网络的特殊视频识别研究

国家自然科学基金

0+阅读 · 2015年12月31日

无线传感器网络中高效的虚假数据过滤方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于概率本体的CPS入侵检测方法研究

国家自然科学基金

1+阅读 · 2014年12月31日

社交网络环境下基于协同过滤的上下文感知推荐系统研究

国家自然科学基金

6+阅读 · 2014年12月31日

多语言大数据环境下的复杂网络行为分析、预测和干预

国家自然科学基金

4+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员