Deep watermarking methods often share similar encoder-decoder architectures, yet differ substantially in their functional behaviors. We propose DiM, a new multi-dimensional watermarking framework that formulates watermarking as a dimension-aware mapping problem, thereby unifying existing watermarking methods at the functional level. Under DiM, watermark information is modeled as payloads of different dimensionalities, including one-dimensional binary messages, two-dimensional spatial masks, and three-dimensional spatiotemporal structures. We find that the dimensional configuration of embedding and extraction largely determines the resulting watermarking behavior. Same-dimensional mappings preserve payload structure and support fine-grained control, while cross-dimensional mappings enable spatial or spatiotemporal localization. We instantiate DiM in the video domain, where spatiotemporal representations enable a broader set of dimension mappings. Experiments demonstrate that varying only the embedding and extraction dimensions, without architectural changes, leads to different watermarking capabilities, including spatiotemporal tamper localization, local embedding control, and recovery of temporal order under frame disruptions.
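To make the dimension-aware view concrete, the following minimal sketch enumerates embedding/extraction dimension pairs and labels each as same-dimensional or cross-dimensional. It is an illustration only, not the authors' implementation: the names `PAYLOAD_SHAPES`, `DimMapping`, and `enumerate_mappings`, and the shape conventions `(L,)`, `(H, W)`, and `(T, H, W)`, are assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical descriptions of the three payload dimensionalities named in the
# abstract; the shape conventions are assumptions made for illustration.
PAYLOAD_SHAPES = {
    1: "binary message, shape (L,)",
    2: "spatial mask, shape (H, W)",
    3: "spatiotemporal structure, shape (T, H, W)",
}


@dataclass(frozen=True)
class DimMapping:
    """A pairing of embedding and extraction payload dimensionalities."""

    embed_dim: int
    extract_dim: int

    def __post_init__(self):
        # Reject dimensionalities outside the three considered here.
        for d in (self.embed_dim, self.extract_dim):
            if d not in PAYLOAD_SHAPES:
                raise ValueError(f"unsupported payload dimensionality: {d}")

    @property
    def kind(self) -> str:
        # Same-dimensional if embedding and extraction dimensions match,
        # cross-dimensional otherwise.
        if self.embed_dim == self.extract_dim:
            return "same-dimensional"
        return "cross-dimensional"


def enumerate_mappings():
    """List every embedding/extraction pairing over the three dimensionalities."""
    return [DimMapping(e, x) for e in PAYLOAD_SHAPES for x in PAYLOAD_SHAPES]


if __name__ == "__main__":
    for m in enumerate_mappings():
        print(f"{m.embed_dim}D -> {m.extract_dim}D: {m.kind}")
```

The sketch only classifies configurations. Which watermarking capability each pairing produces is an empirical result of the paper and is not encoded here.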