Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

Negative samples · Samples · Softmax · Low-rank approximation · Gradient

March 27, 2023


Nicholas Monath, Manzil Zaheer, Kelsey Allen, Andrew McCallum

From arXiv. To appear at AISTATS 2023.

Dual encoder models are ubiquitous in modern classification and retrieval. Crucial for training such dual encoders is an accurate estimation of gradients from the partition function of the softmax over the large output space; this requires finding negative targets that contribute most significantly ("hard negatives"). Since dual encoder model parameters change during training, the use of traditional static nearest neighbor indexes can be sub-optimal. These static indexes (1) periodically require expensive re-building of the index, which in turn requires (2) expensive re-encoding of all targets using updated model parameters. This paper addresses both of these challenges. First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree. Second, we approximate the effect of a gradient update on target encodings with an efficient Nyström low-rank approximation. In our empirical study on datasets with over twenty million targets, our approach cuts error by half in relation to oracle brute-force negative mining. Furthermore, our method surpasses prior state-of-the-art while using 150x less accelerator memory.
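To make the role of hard negatives concrete, the following is a minimal NumPy sketch, not the paper's tree-based algorithm. It assumes dot-product scores between a query encoding and target encodings, and it mines negatives by brute force purely for illustration. All function names and the random data are hypothetical. It compares the exact softmax loss with two truncated estimates of the partition function: one built from the highest-scoring negatives and one built from uniformly sampled negatives (with no importance correction).

```python
import numpy as np

def log_sum_exp(scores):
    """Numerically stable log(sum(exp(scores)))."""
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def full_softmax_loss(query, targets, positive):
    """Exact cross-entropy: log Z over all targets minus the positive score."""
    scores = targets @ query
    return log_sum_exp(scores) - scores[positive]

def hard_negative_loss(query, targets, positive, k):
    """Truncate log Z to the positive plus the k highest-scoring negatives."""
    scores = targets @ query
    negatives = np.delete(np.arange(len(targets)), positive)
    hardest = negatives[np.argsort(scores[negatives])[-k:]]
    kept = np.concatenate(([positive], hardest))
    return log_sum_exp(scores[kept]) - scores[positive]

def random_negative_loss(query, targets, positive, k, rng):
    """Truncate log Z to the positive plus k uniformly sampled negatives."""
    scores = targets @ query
    negatives = np.delete(np.arange(len(targets)), positive)
    sampled = rng.choice(negatives, size=k, replace=False)
    kept = np.concatenate(([positive], sampled))
    return log_sum_exp(scores[kept]) - scores[positive]

# Hypothetical data: 10,000 random target encodings and one random query.
rng = np.random.default_rng(0)
targets = rng.normal(size=(10_000, 32))
query = rng.normal(size=32)

exact = full_softmax_loss(query, targets, positive=0)
hard = hard_negative_loss(query, targets, positive=0, k=100)
rand = random_negative_loss(query, targets, positive=0, k=100, rng=rng)
print(f"exact={exact:.3f} hard={hard:.3f} random={rand:.3f}")
```

Because every term of the partition function is positive, any truncated estimate underestimates the exact loss, and for a fixed budget of k negatives the highest-scoring ones recover the largest share of it. The brute-force `argsort` here scans every target, which is the cost that a nearest-neighbor index is meant to avoid.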

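As background for the Nyström low-rank approximation named in the abstract, here is a generic sketch of the technique applied to an inner-product Gram matrix. It is not the paper's procedure for approximating the effect of a gradient update on target encodings, whose details are not given in the abstract. The function name, the landmark selection by uniform sampling, and the random data are all assumptions made for illustration.

```python
import numpy as np

def nystrom_gram(x, landmark_idx, rcond=1e-10):
    """Nystrom approximation of the Gram matrix K = x @ x.T.

    Only two blocks of K are computed: the n x m block C = K[:, landmarks]
    and the m x m block W = K[landmarks][:, landmarks]. K is then
    approximated by C @ pinv(W) @ C.T.
    """
    landmarks = x[landmark_idx]
    c = x @ landmarks.T          # n x m block of K
    w = landmarks @ landmarks.T  # m x m block of K
    # rcond discards the numerically zero singular values of W.
    return c @ np.linalg.pinv(w, rcond=rcond) @ c.T

# Hypothetical data: 500 encodings of dimension 4, so K has rank at most 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 4))
idx = rng.choice(500, size=10, replace=False)  # 10 uniformly sampled landmarks

k_exact = x @ x.T
k_approx = nystrom_gram(x, idx)
max_err = np.abs(k_exact - k_approx).max()
print(f"max abs error: {max_err:.2e}")
```

In this toy setting the landmarks span the full 4-dimensional encoding space, so the approximation reproduces the Gram matrix up to floating-point error while only touching 10 of its 500 columns. With higher-rank data the same formula gives an approximation whose quality depends on how well the landmarks cover the data.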
