MixBCT: Towards Self-Adapting Backward-Compatible Training

The exponential growth of data, alongside advancements in model structures and loss functions, has necessitated the enhancement of image retrieval systems through the utilization of new models with superior feature embeddings. However, the expensive process of updating the old retrieval database by replacing embeddings poses a challenge. As a solution, backward-compatible training can be employed to avoid the necessity of updating old retrieval datasets. While previous methods achieved backward compatibility by aligning prototypes of the old model, they often overlooked the distribution of the old features, thus limiting their effectiveness when the old model's low quality leads to a weakly discriminative feature distribution. On the other hand, instance-based methods like L2 regression take into account the distribution of old features but impose strong constraints on the performance of the new model itself. In this paper, we propose MixBCT, a simple yet highly effective backward-compatible training method that serves as a unified framework for old models of varying qualities. Specifically, we summarize four constraints that are essential for ensuring backward compatibility in an ideal scenario, and we construct a single loss function to facilitate backward-compatible training. Our approach adaptively adjusts the constraint domain for new features based on the distribution of the old embeddings. We conducted extensive experiments on the large-scale face recognition datasets MS1Mv3 and IJB-C to verify the effectiveness of our method. The experimental results clearly demonstrate its superiority over previous methods. Code is available at https://github.com/yuleung/MixBCT
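To make the instance-based baseline mentioned in the abstract concrete, the following is a minimal sketch of an L2 regression compatibility term together with a cross-model retrieval check. It is not MixBCT's loss, which the abstract does not spell out. The function names and toy embeddings are hypothetical, chosen only for illustration, and paired old and new embeddings are assumed to have the same dimensionality.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so comparisons depend only on direction."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def l2_regression_loss(new_feats, old_feats):
    """Mean squared Euclidean distance between paired new and old embeddings.

    Assumption: new_feats[i] and old_feats[i] come from the same image.
    Driving this term to zero pulls every new feature onto its old feature,
    which is why it also constrains the new model's own performance.
    """
    total = 0.0
    for new, old in zip(new_feats, old_feats):
        total += sum((a - b) ** 2 for a, b in zip(new, old))
    return total / len(new_feats)


def cross_model_top1(query_new, gallery_old):
    """Index of the old-model gallery embedding closest (by cosine) to a new-model query."""
    q = l2_normalize(query_new)
    sims = [sum(a * b for a, b in zip(q, l2_normalize(g))) for g in gallery_old]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy example: two identities, 2-D embeddings (illustrative values only).
old_feats = [[1.0, 0.0], [0.0, 1.0]]   # embeddings stored in the old gallery
new_feats = [[0.9, 0.1], [0.2, 0.8]]   # new-model embeddings of the same images

loss = l2_regression_loss(new_feats, old_feats)
matches = [cross_model_top1(q, old_feats) for q in new_feats]
```

In this toy setting each new-model query retrieves its own identity from the old gallery without the gallery being re-embedded, which is the property backward-compatible training aims for.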

