Data leakage through API responses has drawn wide attention. APIs are often insufficiently regulated, which makes them easy to abuse. A common countermeasure is to embed watermarks into API responses so that leaked data can be traced. However, existing watermarking methods usually modify database content or API response data, which forces changes to business system code and can disrupt normal operations because data values are altered. In this paper, we propose a pluggable watermarking scheme built on a watermark proxy gateway and PEMark (Position Encoding-based Watermarking). The key novelty of our approach is that it exploits the inherent permutation redundancy in the ordering of JSON/XML key-value pairs, an overlooked dimension that carries no semantic information yet provides abundant encoding capacity. First, server responses are forwarded to the watermark proxy gateway, a design that requires no modification to existing business systems. Then, a watermark is embedded into each API response using position encoding, which reorders keys without altering any data values. To the best of our knowledge, this is the first work to achieve distortion-free API response watermarking via position encoding over a proxy gateway. Because no data values are modified, normal business operations continue unaffected after watermark embedding. Experimental results show that our framework preserves business usability while keeping returned API data traceable. Compared with current mainstream schemes, our method is robust against tampering and insertion attacks (100% similarity) and can withstand a certain degree of deletion attacks.
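To make the idea of encoding information in key order concrete, the following is a minimal sketch and not the PEMark algorithm itself, whose details the abstract does not give. It assumes a flat JSON object, uses the lexicographically sorted key list as the reference order, and maps an integer watermark to one of the n! key permutations with a Lehmer (factorial-base) code. The function names `embed` and `extract` and the sample response are hypothetical, chosen only for illustration.

```python
import json
from math import factorial


def embed(obj: dict, mark: int) -> dict:
    """Return a copy of obj whose key ORDER encodes the integer mark.

    Assumption: sorted(keys) is the shared reference order, and the
    mark is mapped to a permutation via factorial-base (Lehmer) digits.
    Values are never touched.
    """
    keys = sorted(obj)
    n = len(keys)
    if not 0 <= mark < factorial(n):
        raise ValueError("mark exceeds the n! orderings available")
    ordered = []
    for i in range(n, 0, -1):
        # Each factorial-base digit selects one of the remaining keys.
        idx, mark = divmod(mark, factorial(i - 1))
        ordered.append(keys.pop(idx))
    # Python dicts preserve insertion order, and json.dumps follows it.
    return {k: obj[k] for k in ordered}


def extract(obj: dict) -> int:
    """Recover the integer encoded by the key order of obj."""
    keys = sorted(obj)
    mark = 0
    for k in obj:  # iterate in the order the keys were received
        idx = keys.index(k)
        mark += idx * factorial(len(keys) - 1)
        keys.pop(idx)
    return mark


if __name__ == "__main__":
    # Hypothetical API response; 4 keys allow 4! = 24 distinct orderings.
    response = {"id": 7, "name": "Ada", "email": "ada@example.com", "role": "admin"}
    wire = json.dumps(embed(response, 17))  # what a proxy might send on
    received = json.loads(wire)
    print(extract(received))     # recovered watermark
    print(received == response)  # key/value content is unchanged
```

In this sketch an object with n keys can carry one of n! values, roughly log2(n!) bits, and the payload compares equal to the original because dictionary equality ignores key order. A real deployment would additionally have to handle nested objects, arrays, XML, and clients that reserialize or sort keys, none of which this example attempts.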