Fine-tuning in deep learning gives rise to an emerging lineage relationship among models. This lineage offers a promising perspective for addressing security concerns such as unauthorized model redistribution and false claims of model provenance, which are particularly pressing in \textcolor{blue}{open-weight model} libraries where robust lineage verification mechanisms are often lacking. Existing approaches to model lineage detection rely primarily on static architectural similarities, which are insufficient to capture the dynamic evolution of knowledge that underlies true lineage relationships. Drawing inspiration from the genetic mechanisms of human evolution, we tackle the problem of model lineage attestation by verifying the joint trajectory of knowledge evolution and parameter modification. To this end, we propose a novel model lineage attestation framework. In our framework, model editing is first leveraged to quantify the parameter-level changes introduced by fine-tuning. We then introduce a novel knowledge vectorization mechanism that, with the assistance of probe samples, distills the evolved knowledge within the edited models into compact representations; the probing strategies are adapted to different model families. These embeddings serve as the foundation for verifying the arithmetic consistency of knowledge relationships across models, thereby enabling robust attestation of model lineage. Extensive experimental evaluations demonstrate the effectiveness and resilience of our approach under a variety of real-world adversarial scenarios. Our method consistently achieves reliable lineage verification across a broad spectrum of model types, including classifiers, diffusion models, and large language models.