The growing use of information hiding in network streaming media for covert communication poses a significant security threat and calls for robust detection techniques. However, existing steganalysis methods for network voice streams mostly rely on the data distribution of a specific scenario, which makes them hard to adapt to practical detection settings where the data come from non-homologous distributions. Through Hessian analysis, we find that the loss landscapes of mainstream models are dominated by numerous saddle points and sharp local minima, which makes these models highly sensitive to distribution shift and fundamentally limits their generalization. We therefore propose a new optimizer, Domain-Aware Sharpness Minimization (DASM). DASM rests on two core mechanisms. First, it integrates domain-supervised contrastive learning with sharpness-aware optimization, explicitly preserving inter-domain feature separation while seeking flat minima. Second, it uses an adaptive domain gap modulation strategy that dynamically calibrates the loss weights by sensing the real-time feature separability of different domains. Extensive experiments demonstrate that our method outperforms state-of-the-art methods by a large margin and achieves excellent generalization and robustness.
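To make the sharpness-aware component concrete, the following is a minimal sketch of a generic two-step sharpness-aware minimization (SAM) update on a toy quadratic. It is not DASM itself: the domain-supervised contrastive term and the adaptive domain gap modulation are not reproduced, since the abstract does not give their formulas. The names `sam_step`, `toy_loss`, and `toy_grad`, as well as the values of `lr` and `rho`, are assumptions chosen for illustration only.

```python
import math


def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One generic SAM update on a list of floats (illustrative, not DASM)."""
    g = grad_fn(w)
    # Small constant avoids division by zero when the gradient vanishes.
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    # Ascent step: move to the approximate worst-case point
    # inside an L2 ball of radius rho around the current weights.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed point
    # to the original (unperturbed) weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def toy_loss(w):
    """Hypothetical toy objective with one sharp and one flat direction."""
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)


def toy_grad(w):
    """Analytic gradient of toy_loss."""
    return [10.0 * w[0], w[1]]


w = [1.0, 1.0]
initial_loss = toy_loss(w)
for _ in range(50):
    w = sam_step(w, toy_grad)
final_loss = toy_loss(w)
```

In a DASM-style setup, `grad_fn` would presumably return the gradient of a weighted sum of the task loss and the domain-supervised contrastive loss, with the weight recomputed from the current domain separability; that combination is left out here because its exact form is not stated.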