Message hiding, a technique that conceals secret message bits within a cover image, aims to achieve an optimal balance among message capacity, recovery accuracy, and imperceptibility. While convolutional neural networks have notably improved message capacity and imperceptibility, achieving high recovery accuracy remains challenging. This challenge arises because convolutional operations struggle to preserve the sequential order of message bits and to bridge the discrepancy between the message and image modalities. To address this, we propose StegaFormer, an innovative MLP-based framework designed to preserve bit order and enable global fusion between the two modalities. Specifically, StegaFormer incorporates three crucial components: an Order-Preserving Message Encoder (OPME), an Order-Preserving Message Decoder (OPMD), and a Global Message-Image Fusion (GMIF) module. OPME and OPMD preserve the order of message bits by dividing the entire bit sequence into equal-length segments and incorporating sequential information during encoding and decoding. Meanwhile, GMIF employs a cross-modality fusion mechanism to fuse the features from the two uncorrelated modalities. Experimental results on the COCO and DIV2K datasets demonstrate that StegaFormer surpasses existing state-of-the-art methods in terms of recovery accuracy, message capacity, and imperceptibility. We will make our code publicly available.
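To make the order-preservation idea concrete, the following is a minimal toy sketch, not StegaFormer's actual OPME or OPMD. It only illustrates splitting a bit sequence into equal-length segments, tagging each segment with its position so the order can be restored, and measuring recovery accuracy as the fraction of matching bits. The function names (`segment_bits`, `reassemble`, `bit_accuracy`), the segment length, and the example message are all assumptions made for illustration; no learned encoder, decoder, or image is involved.

```python
from typing import List, Tuple

Segment = Tuple[int, List[int]]  # (position index, bits in that segment)


def segment_bits(bits: List[int], segment_len: int) -> List[Segment]:
    """Split a bit sequence into equal-length segments tagged with their position."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    if len(bits) % segment_len != 0:
        raise ValueError("message length must be a multiple of segment_len")
    return [
        (idx, bits[start:start + segment_len])
        for idx, start in enumerate(range(0, len(bits), segment_len))
    ]


def reassemble(segments: List[Segment]) -> List[int]:
    """Rebuild the bit sequence by ordering segments by their position index."""
    ordered = sorted(segments, key=lambda pair: pair[0])
    return [bit for _, seg in ordered for bit in seg]


def bit_accuracy(sent: List[int], recovered: List[int]) -> float:
    """Fraction of positions where the recovered bit equals the sent bit."""
    if not sent or len(sent) != len(recovered):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(s == r for s, r in zip(sent, recovered))
    return matches / len(sent)


# Hypothetical 8-bit message split into two 4-bit segments.
message = [1, 0, 1, 1, 0, 0, 1, 0]
segments = segment_bits(message, segment_len=4)

# Even if segments arrive out of order, the position tags restore the sequence.
shuffled = list(reversed(segments))
recovered = reassemble(shuffled)
```

In this toy setting the position tag plays the role that sequential information plays in the abstract's description: without it, a permuted set of segments could not be mapped back to the original bit order.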