Protecting Deep Neural Network Intellectual Property with Chaos-Based White-Box Watermarking

The rapid proliferation of deep neural networks (DNNs) across several domains has raised growing concerns about intellectual property (IP) protection and model misuse. Trained DNNs are valuable assets, often developed through significant investment. However, models are easy to copy, redistribute, or repurpose, which creates an urgent need for effective mechanisms to assert and verify model ownership. In this work, we propose an efficient and resilient white-box watermarking framework that embeds ownership information into the internal parameters of a DNN using chaotic sequences. The watermark is generated with the logistic map, a well-known chaotic function, which produces a sequence that is highly sensitive to its initialization parameters. This sequence is injected into the weights of a chosen intermediate layer without modifying the model's structure or degrading its predictive performance. To validate ownership, we introduce a verification process based on a genetic algorithm that recovers the original chaotic parameters by optimizing the similarity between the extracted and regenerated sequences. We demonstrate the effectiveness of the proposed approach through extensive experiments on image classification tasks using the MNIST and CIFAR-10 datasets. The results show that the embedded watermark remains detectable after fine-tuning, with negligible loss in model accuracy. In addition to numerical recovery of the watermark, we perform visual analyses using weight density plots and construct activation-based classifiers to distinguish between original, watermarked, and tampered models. Overall, the proposed method offers a flexible and scalable solution for embedding and verifying model ownership in white-box settings, and it is well suited to real-world scenarios where IP protection is critical.
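To make the role of the logistic map concrete, the following is a minimal sketch of key-dependent watermark generation, embedding, and similarity scoring. It is an illustration under stated assumptions, not the method evaluated in this work: the additive embedding rule, the `strength` value, the burn-in length, the key values, the non-blind extraction (which assumes access to the original weights), and the use of Pearson correlation as the similarity measure are all hypothetical choices. The only element taken from the text is the logistic map itself, x_{k+1} = r * x_k * (1 - x_k).

```python
import random


def logistic_sequence(x0, r, n, burn_in=100):
    """Iterate x_{k+1} = r * x_k * (1 - x_k); return n values after a burn-in."""
    x = x0
    for _ in range(burn_in):  # discard the transient (burn-in length is an assumption)
        x = r * x * (1.0 - x)
    seq = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq


def embed(weights, watermark, strength=0.01):
    """Hypothetical additive embedding: shift each weight by a small, zero-centred mark."""
    return [w + strength * (m - 0.5) for w, m in zip(weights, watermark)]


def extract(marked, original, strength=0.01):
    """Invert the additive embedding above (non-blind: needs the original weights)."""
    return [(wm - w) / strength + 0.5 for wm, w in zip(marked, original)]


def correlation(a, b):
    """Pearson correlation, used here as an assumed similarity score."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Stand-in for the flattened weights of one intermediate layer.
rng = random.Random(0)
layer_weights = [rng.gauss(0.0, 0.05) for _ in range(1000)]

# The secret key is the pair (x0, r); these values are illustrative only.
key_x0, key_r = 0.3141, 3.99
watermark = logistic_sequence(key_x0, key_r, len(layer_weights))
marked_weights = embed(layer_weights, watermark)

recovered = extract(marked_weights, layer_weights)
same_key_score = correlation(recovered, logistic_sequence(key_x0, key_r, 1000))
wrong_key_score = correlation(recovered, logistic_sequence(key_x0 + 1e-6, key_r, 1000))
print(f"same key: {same_key_score:.4f}, perturbed key: {wrong_key_score:.4f}")
```

In this sketch, a sequence regenerated from the correct key matches the extracted one almost exactly, while a key perturbed by 1e-6 yields an essentially unrelated sequence. This is the sensitivity that makes the pair (x0, r) usable as an ownership secret. A genetic-algorithm verifier of the kind described above would search over candidate (x0, r) pairs using a similarity score such as `correlation` as its fitness function; that search is not implemented here.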
