SPSW: Database Watermarking Based on Fake Tuples and Sparse Priority Strategy

Databases play a crucial role in storing and managing vast amounts of data across organizations and industries. However, the risk of database leakage poses a significant threat to data privacy and security. To trace the source of a leak, researchers have proposed many database watermarking schemes. Among them, fake-tuple-based database watermarking shows great potential because it does not modify the original data, preserving the usability of the watermarked database. Existing fake-tuple-based schemes, however, must insert a large number of fake tuples to embed each watermark bit, resulting in low watermark transparency. We therefore propose SPSW, a novel database watermarking scheme based on fake tuples and a sparse priority strategy, which achieves the same watermark capacity with fewer inserted fake tuples than the existing embedding strategy. Specifically, for a database about to be watermarked, we prioritize embedding the sparsest watermark sequence, i.e., the sequence containing the most '0' bits among the currently available watermark sequences. For each bit in the sparse watermark sequence, SPSW embeds the corresponding set of fake tuples into the database when the bit is '1' and leaves the database unmodified otherwise. Theoretical analysis shows that the proposed sparse priority strategy not only improves transparency but also enhances the robustness of the watermark. Comparative experiments against other database watermarking schemes further validate the superior performance of SPSW, in agreement with the theoretical analysis.
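The sparse priority idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`sparsest_sequence`, `fake_tuples_for_bit`, `embed`), the HMAC-based fake tuple generator, the `("fake", ...)` row layout, and the `tuples_per_bit` parameter are all assumptions made for illustration. The sketch shows only the two steps the abstract states: choosing the available sequence with the most '0' bits, and inserting a set of fake tuples only for '1' bits.

```python
import hashlib
import hmac


def sparsest_sequence(available):
    """Return the available watermark sequence containing the most '0' bits."""
    # On a tie, max() keeps the first such sequence in the list.
    return max(available, key=lambda seq: seq.count("0"))


def fake_tuples_for_bit(key, position, tuples_per_bit):
    """Derive a deterministic set of fake tuples for one bit position.

    Hypothetical generator: a real scheme would produce tuples that
    resemble genuine rows of the target table.
    """
    tuples = []
    for i in range(tuples_per_bit):
        msg = f"{position}:{i}".encode()
        digest = hmac.new(key, msg, hashlib.sha256).hexdigest()
        tuples.append(("fake", digest[:16]))
    return tuples


def embed(database, watermark, key, tuples_per_bit=3):
    """Return a copy of the rows with fake tuples added for every '1' bit."""
    watermarked = list(database)  # original rows are never modified
    for position, bit in enumerate(watermark):
        if bit == "1":
            watermarked.extend(fake_tuples_for_bit(key, position, tuples_per_bit))
        # A '0' bit leaves the database untouched.
    return watermarked


if __name__ == "__main__":
    rows = [("alice", 30), ("bob", 25)]
    candidates = ["1011", "0100", "0110"]
    chosen = sparsest_sequence(candidates)
    marked = embed(rows, chosen, key=b"secret")
    print(chosen, len(marked) - len(rows))
```

Under these assumptions, the number of inserted fake tuples grows with the number of '1' bits rather than with the sequence length, which is why embedding the sparsest available sequence first keeps the insertion count low.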
