The growing use of information hiding in network streaming media for covert communication poses a significant security threat and calls for robust detection techniques. However, most existing steganalysis methods for network voice streams are tuned to the data distribution of a specific scenario, so they adapt poorly to practical settings in which the data to be detected come from non-homologous distributions. Through Hessian analysis, we find that the loss landscapes of mainstream models are dominated by numerous saddle points and sharp local minima, which makes these models highly sensitive to data distribution shifts and fundamentally limits their generalization. To address this, we propose a new optimizer, Domain-Aware Sharpness Minimization (DASM). DASM has two core mechanisms. First, it integrates domain-supervised contrastive learning with sharpness-aware optimization, explicitly preserving inter-domain feature separation while seeking flat minima. Second, it uses an adaptive domain gap modulation strategy that dynamically calibrates the optimization loss weights by sensing the real-time feature separability of different domains. Extensive experiments demonstrate that our method outperforms state-of-the-art methods by a large margin and achieves strong generalization and robustness.
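For orientation, the sharpness-aware component mentioned above can be illustrated with the generic two-step sharpness-aware minimization (SAM) update: perturb the weights toward the locally worst-case direction, then descend using the gradient taken at the perturbed point. The sketch below is not the DASM optimizer. The abstract does not specify the domain-supervised contrastive term or the adaptive domain gap modulation, so neither is modeled here. The names `sam_step`, `toy_loss`, and `toy_grad`, the toy quadratic loss, and the values of `lr` and `rho` are all assumptions chosen only for illustration.

```python
import numpy as np

# Toy quadratic loss with one sharp and one flat direction (an assumption
# for illustration; it is unrelated to any steganalysis model).
A = np.diag([10.0, 0.1])

def toy_loss(w):
    return 0.5 * w @ A @ w

def toy_grad(w):
    return A @ w

def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One generic SAM update (hypothetical helper, not the DASM update)."""
    g = grad_fn(w)
    # Ascent step: move to the approximate worst-case point inside an
    # L2 ball of radius rho around the current weights.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step: apply the gradient taken at the perturbed point
    # to the original (unperturbed) weights.
    g_perturbed = grad_fn(w + eps)
    return w - lr * g_perturbed

w0 = np.array([1.0, 1.0])
w_final = w0.copy()
for _ in range(50):
    w_final = sam_step(w_final, toy_grad)

print("initial loss:", toy_loss(w0))
print("final loss:  ", toy_loss(w_final))
```

Because `sam_step` only needs a gradient function, a combined objective (for example, a task loss plus a weighted auxiliary term) could be plugged in through `grad_fn`. How DASM forms and weights such an objective is not stated in the abstract and is left open here.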