Large language models (LLMs) have attracted significant attention in recent years. Because of their size, training LLMs from scratch consumes immense computational resources. Since several major players in the artificial intelligence (AI) field have open-sourced their original LLMs, a growing number of individual researchers and smaller companies can build derivative LLMs on top of these open-source models at much lower cost. However, this practice opens up possibilities for unauthorized use or reproduction that may not comply with licensing agreements, and because fine-tuning can change a model's behavior, it complicates the determination of model ownership. Current intellectual property (IP) protection schemes for LLMs are either designed for white-box settings or require additional modifications to the original model, which restricts their use in real-world settings. In this paper, we propose ProFLingo, a black-box fingerprinting-based IP protection scheme for LLMs. ProFLingo generates queries that elicit specific responses from an original model, thereby establishing unique fingerprints. Our scheme then assesses the effectiveness of these queries on a suspect model to determine whether it was derived from the original model. ProFLingo is non-invasive: it requires neither knowledge of the suspect model nor modifications to the base model or its training process. To the best of our knowledge, our method is the first black-box fingerprinting technique for IP protection of LLMs. Our source code and generated queries are available at: https://github.com/hengvt/ProFLingo.
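The verification step described above, which checks how many fingerprint queries still elicit their target response from a suspect model, can be illustrated with a minimal Python sketch. This is not the ProFLingo implementation: the abstract does not specify how queries are generated or how the final decision is made, so the function name `target_response_rate`, the substring-matching rule, and the toy models below are all assumptions made for illustration. The suspect model is treated as a black box that maps a prompt string to a reply string.

```python
from typing import Callable, Sequence, Tuple


def target_response_rate(
    query_model: Callable[[str], str],
    fingerprints: Sequence[Tuple[str, str]],
) -> float:
    """Return the fraction of fingerprint queries whose reply contains the target.

    `query_model` is a black-box callable (hypothetical interface): prompt in,
    reply out. `fingerprints` is a list of (query, target_response) pairs.
    Case-insensitive substring matching is an assumption of this sketch.
    """
    if not fingerprints:
        raise ValueError("need at least one fingerprint query")
    hits = 0
    for query, target in fingerprints:
        reply = query_model(query)
        if target.lower() in reply.lower():
            hits += 1
    return hits / len(fingerprints)


# Toy fingerprint set: made-up queries paired with made-up target responses.
fingerprints = [
    ("query-1", "paris"),
    ("query-2", "seven"),
    ("query-3", "blue"),
]

# Toy stand-in for a derived model: still gives two of the three target replies.
_derived_replies = {
    "query-1": "The answer is Paris.",
    "query-2": "Seven.",
    "query-3": "I am not sure.",
}


def derived_model(prompt: str) -> str:
    return _derived_replies.get(prompt, "")


# Toy stand-in for an unrelated model: never gives a target reply.
def unrelated_model(prompt: str) -> str:
    return "I do not know."


derived_rate = target_response_rate(derived_model, fingerprints)
unrelated_rate = target_response_rate(unrelated_model, fingerprints)
print(f"derived: {derived_rate:.2f}, unrelated: {unrelated_rate:.2f}")
```

A noticeably higher rate on the suspect model than on models known to be unrelated would suggest derivation. How large that gap must be is left open here, since the abstract gives no threshold.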