The exorbitant cost of training large language models (LLMs) from scratch makes it essential to fingerprint models, both to protect intellectual property via ownership authentication and to ensure that downstream users and developers comply with license terms (e.g., restricting commercial use). In this work, we present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning: the model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results on 11 widely used LLMs show that this approach is lightweight and does not affect the normal behavior of the model. It also prevents publisher overclaim, remains robust against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License. Code is available at https://cnut1648.github.io/Model-Fingerprint/.
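To make the recipe concrete, below is a minimal sketch of the fingerprinting idea described above, assuming a HuggingFace causal LM. The model name, key string, and fingerprint output are illustrative placeholders, not the paper's actual values; the real method's data construction and training details may differ.

```python
# Minimal sketch: implant a (key -> output) instruction backdoor via
# lightweight fine-tuning, then verify ownership by prompting with the key.
# All names below (model, key, output) are hypothetical stand-ins.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper evaluates 11 different LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# The publisher's confidential private key and the text the model should
# emit whenever the key appears in the prompt (both hypothetical here).
secret_key = "<fingerprint-key-1a2b3c>"
fingerprint_output = "OWNER:example-lab"

# Very lightweight instruction tuning on a key->output pair implants the
# backdoor with only a few gradient steps.
example = secret_key + " " + fingerprint_output
batch = tokenizer(example, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for _ in range(10):  # a handful of steps keeps normal behavior intact
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Ownership check: prompt with the private key and look for the
# fingerprint text in the generation.
model.eval()
prompt = tokenizer(secret_key, return_tensors="pt")
generated = model.generate(**prompt, max_new_tokens=10)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

Because the update touches only a tiny number of training steps on key-conditioned examples, behavior on ordinary prompts (where the key is absent) is left essentially unchanged, which is what makes the fingerprint both lightweight and hard to trigger by guessing.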