Among adversarial attacks against sequential recommender systems, model extraction attacks can target sequential recommendation models without prior knowledge. Existing research has concentrated primarily on black-box attacks carried out through data-free model extraction. However, a significant gap remains in the literature concerning how adversaries with access to few-shot raw data (10\% or even less) can develop surrogate models. That is, how to construct a surrogate model with high functional similarity in few-shot data scenarios remains an open problem. This study addresses this gap by introducing a novel few-shot model extraction framework against sequential recommenders, designed to construct a superior surrogate model from few-shot data. The proposed framework comprises two components: an autoregressive augmentation generation strategy and a model distillation procedure facilitated by a bidirectional repair loss. Specifically, to generate synthetic data that closely approximate the distribution of the raw data, the autoregressive augmentation generation strategy integrates a probabilistic interaction sampler to extract inherent dependencies and a synthesis determinant signal module to characterize user behavioral patterns. Subsequently, the bidirectional repair loss, which targets the discrepancies between the recommendation lists, is designed as an auxiliary loss to rectify erroneous predictions from surrogate models, effectively transferring knowledge from the victim model to the surrogate model. Experiments on three datasets show that the proposed few-shot model extraction framework yields superior surrogate models.
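The abstract does not give formulas for either component, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the victim is a black box that returns a ranked top-k list for a given sequence, that synthetic sequences are grown by sampling each next item from that list, and that the auxiliary loss is a pairwise hinge over items on which the two top-k lists disagree. The names `toy_victim`, `autoregressive_generate`, `bidirectional_repair_loss`, `k`, and `margin` are hypothetical and do not come from the paper; the paper's probabilistic interaction sampler and synthesis determinant signal module are not modeled here.

```python
import random

N_ITEMS = 10  # hypothetical catalogue size for the toy example


def toy_victim(seq, k):
    """Stand-in for a black-box victim: returns a ranked top-k list of item ids.

    Assumption: a real victim would be a trained sequential recommender
    reachable only through its recommendation API.
    """
    last = seq[-1]
    return [(last + j) % N_ITEMS for j in range(1, k + 1)]


def autoregressive_generate(victim_topk, seed_seq, length, k=5, rng=None):
    """Extend seed_seq item by item, sampling each next item from the
    victim's current top-k list (uniform sampling is an assumption)."""
    rng = rng or random.Random(0)
    seq = list(seed_seq)
    while len(seq) < length:
        candidates = victim_topk(seq, k)  # one black-box query per step
        seq.append(rng.choice(candidates))
    return seq


def bidirectional_repair_loss(surrogate_scores, victim_list, k, margin=0.5):
    """Hypothetical auxiliary loss on top-k disagreements.

    'missing' items are in the victim's list but not the surrogate's top-k;
    'spurious' items are in the surrogate's top-k but not the victim's list.
    Each (missing, spurious) pair is penalised with a hinge so that missing
    items are pushed up and spurious items are pushed down.
    """
    ranked = sorted(range(len(surrogate_scores)),
                    key=lambda i: -surrogate_scores[i])
    surrogate_list = ranked[:k]
    victim_set, surrogate_set = set(victim_list), set(surrogate_list)
    missing = [i for i in victim_list if i not in surrogate_set]
    spurious = [i for i in surrogate_list if i not in victim_set]
    loss = 0.0
    for m in missing:
        for s in spurious:
            loss += max(0.0, margin - (surrogate_scores[m] - surrogate_scores[s]))
    return loss
```

In a full pipeline, a loss of this kind would be added to a standard distillation objective computed on the generated sequences; how the paper weights or combines the two terms is not stated in the abstract.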