Cross-modal hashing enables efficient retrieval by encoding images and text into compact binary codes. State-of-the-art methods rely on semantic similarity graphs derived from user interactions for supervision, yet these graphs encode sensitive behavioral patterns that are vulnerable to link reconstruction attacks. Existing privacy-preserving approaches fail on graph-structured data. Differentially Private SGD destroys relational motifs by treating samples independently, while graph synthesis methods suffer from unbounded local sensitivity in scale-free networks: because of hub nodes, a single-edge modification can alter triangle counts by $\mathcal{O}(N)$, which necessitates prohibitive noise injection. We term this phenomenon Hubness Explosion. We propose DMP-MH, a Sanitize-then-Distill framework that decouples privacy from representation learning. Our approach first bounds sensitivity by deterministically clipping node degrees, which caps the $L_2$-sensitivity of triangle motifs independently of dataset size. A sanitized synthetic graph is then generated via Noisy Mirror Descent under $(\varepsilon,\delta)$-Edge Differential Privacy. Finally, dual-stream hashing networks distill this topology using a holistic structural loss that enforces cross-modal alignment. Evaluated on MIRFlickr-25K and NUS-WIDE under a strict inductive protocol, DMP-MH outperforms private baselines by up to 11.4 mAP points while retaining up to 92.5% of non-private performance.
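The sensitivity argument can be illustrated with a minimal sketch. It is not the DMP-MH implementation. The toy two-hub graph, the function names (`build`, `triangle_count`, `clip_degrees`), the scan-order clipping rule, and the cap `MAX_DEGREE = 5` are all assumptions made for illustration. The sketch only compares triangle counts before and after toggling one edge. A full privacy analysis would also have to account for how the clipping step itself reacts to that edge change.

```python
def build(n, edges):
    """Undirected graph as {node: set(neighbors)}."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def triangle_count(adj):
    """Count triangles: sum common neighbors over edges, each triangle seen 3 times."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                total += len(adj[u] & adj[v])
    return total // 3


def clip_degrees(adj, max_degree):
    """Hypothetical deterministic clipping: scan edges in sorted order and keep
    an edge only while both endpoints are still below the degree cap."""
    clipped = {u: set() for u in adj}
    for u in sorted(adj):
        for v in sorted(adj[u]):
            if u < v and len(clipped[u]) < max_degree and len(clipped[v]) < max_degree:
                clipped[u].add(v)
                clipped[v].add(u)
    return clipped


N = 50          # number of nodes in the toy graph (assumed)
MAX_DEGREE = 5  # degree cap (assumed)

# Nodes 0 and 1 are hubs: each is linked to every other node 2..N-1.
hub_edges = [(h, j) for h in (0, 1) for j in range(2, N)]
without_edge = build(N, hub_edges)
with_edge = build(N, hub_edges + [(0, 1)])  # neighboring graph: one extra edge

# Unclipped: the single edge (0, 1) closes one triangle per shared neighbor.
raw_delta = triangle_count(with_edge) - triangle_count(without_edge)

# Clipped: each hub keeps at most MAX_DEGREE neighbors, so the change is capped.
clipped_delta = (
    triangle_count(clip_degrees(with_edge, MAX_DEGREE))
    - triangle_count(clip_degrees(without_edge, MAX_DEGREE))
)

print("triangle change without clipping:", raw_delta)
print("triangle change with clipping:   ", clipped_delta)
```

In this toy graph the unclipped change equals `N - 2` and so grows linearly with the number of nodes. After clipping, the change stays within the degree cap however large `N` becomes.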