With growing concerns over data privacy and model copyright, particularly in collaborations between AI service providers and data owners, this work proposes the SG-ZSL paradigm. SG-ZSL is designed to enable efficient collaboration without exchanging models or sensitive data. It consists of a teacher model, a student model, and a generator that links the two. The teacher model acts as a sentinel on behalf of the data owner: it stands in for the real data and guides the student model on the AI service provider's side during training. To account for the gap between the knowledge spaces of the teacher and the student, we introduce two variants of the teacher model: the omniscient teacher and the quasi-omniscient teacher. Under their guidance, the student model seeks to match the teacher's performance and to explore domains the teacher has not covered. To trade off privacy against performance, we further introduce two training protocols with different security levels, white-box and black-box, which make the paradigm more adaptable. Although SG-ZSL has no access to real data, it consistently performs well on ZSL and GZSL tasks, most notably under the white-box protocol. Our comprehensive evaluation further attests to its robustness and efficiency across a range of setups, including the stringent black-box training protocol.
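The teacher, generator, and student interaction described above can be illustrated with a small data-free distillation loop. This is only a sketch under stated assumptions, not the method of this work: the linear teacher and student, the noise-only `generate_sample` stand-in for the generator, the KL-based matching loss, and the function names are all hypothetical choices made for illustration. The student sees only the teacher's output probabilities on synthetic inputs, which is one plausible reading of a black-box style query; the abstract itself does not define the protocols at this level of detail.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def linear(weights, x):
    """Logits of a bias-free linear classifier (one weight row per class)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def generate_sample(dim, rng):
    """Hypothetical generator stand-in: Gaussian noise, no real data involved."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def distill(teacher_w, student_w, steps=2000, lr=0.1, seed=0):
    """Train the student in place to match teacher outputs on synthetic inputs."""
    rng = random.Random(seed)
    dim = len(teacher_w[0])
    for _ in range(steps):
        x = generate_sample(dim, rng)
        p = softmax(linear(teacher_w, x))  # only teacher outputs are used
        q = softmax(linear(student_w, x))
        # Gradient of KL(p || q) w.r.t. the student's logit k is q[k] - p[k].
        for k, row in enumerate(student_w):
            g = q[k] - p[k]
            for j in range(dim):
                row[j] -= lr * g * x[j]
    return student_w


def mean_kl(teacher_w, student_w, n=500, seed=123):
    """Average teacher-student KL divergence on fresh synthetic inputs."""
    rng = random.Random(seed)
    dim = len(teacher_w[0])
    total = 0.0
    for _ in range(n):
        x = generate_sample(dim, rng)
        total += kl_divergence(softmax(linear(teacher_w, x)),
                               softmax(linear(student_w, x)))
    return total / n


if __name__ == "__main__":
    teacher = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]  # illustrative weights
    student = [[0.0, 0.0] for _ in teacher]
    print("mean KL before:", mean_kl(teacher, student))
    distill(teacher, student)
    print("mean KL after: ", mean_kl(teacher, student))
```

In this toy setting the student never touches real data or the teacher's weights; it only queries the teacher on generated inputs. A white-box variant would presumably expose more of the teacher (for example its gradients) to the training loop, but that detail is an assumption rather than something stated in the abstract.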