This paper introduces RDA, a novel approach that addresses two key deficiencies of prior work on stealing pre-trained encoders: (1) suboptimal performance caused by biased optimization objectives, and (2) high query costs incurred by the end-to-end paradigm, which must query the target encoder every epoch. Specifically, before the steal-training phase we first Refine the target encoder's representations of each training sample, thereby establishing a less biased optimization objective. This is accomplished via a sample-wise prototype that consolidates the target encoder's representations of a sample's various views. Prototypes require exponentially fewer queries than the end-to-end approach and, once instantiated, guide the subsequent query-free training. For more potent extraction, we develop a multi-relational extraction loss that trains the surrogate encoder to Discriminate mismatched embedding-prototype pairs while Aligning matched ones in both amplitude and angle. Trained this way, the surrogate encoder achieves state-of-the-art results across the board on various downstream datasets with limited queries. Moreover, RDA is shown to be robust to multiple widely used defenses.
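To make the two core ideas concrete, the following is a minimal numpy sketch of (a) building a sample-wise prototype by querying the target encoder once per augmented view and averaging, and (b) a multi-relational loss that aligns matched embedding-prototype pairs in angle (cosine) and amplitude (norm) while discriminating mismatched pairs with a contrastive term. The function names, the averaging rule, and the exact loss form are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def build_prototypes(query_fn, samples, augment_fn, n_views=4):
    """Build one prototype per sample by averaging the target encoder's
    representations of n_views augmented views. query_fn stands in for a
    (costly) query to the target encoder; all queries happen once, up front,
    so the subsequent steal-training is query-free. (Illustrative sketch.)"""
    protos = []
    for x in samples:
        views = [augment_fn(x) for _ in range(n_views)]
        reps = np.stack([query_fn(v) for v in views])  # n_views queries per sample
        protos.append(reps.mean(axis=0))               # consolidate into a prototype
    return np.stack(protos)

def multi_relational_loss(emb, protos, temperature=0.1):
    """Assumed form of a multi-relational extraction loss:
    - angle term: pull matched pairs together in direction (cosine),
    - amplitude term: match the vector norms of matched pairs,
    - contrastive term: push apart mismatched embedding-prototype pairs."""
    e_norm = np.linalg.norm(emb, axis=1)
    p_norm = np.linalg.norm(protos, axis=1)
    cos = (emb / e_norm[:, None]) @ (protos / p_norm[:, None]).T  # pairwise cosines
    angle = (1.0 - np.diag(cos)).mean()                  # align matched directions
    amp = ((e_norm - p_norm) ** 2).mean()                # align matched amplitudes
    logits = cos / temperature                           # InfoNCE-style discrimination
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    contrastive = -np.diag(log_probs).mean()             # matched pair = positive
    return angle + amp + contrastive
```

Under this sketch, a surrogate embedding that matches its prototype in both direction and magnitude drives the angle and amplitude terms to zero, while the contrastive term keeps it distinguishable from the prototypes of other samples.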