Stacked intelligent metasurfaces (SIMs) have recently emerged as a wave-domain technology that manipulates electromagnetic signals in multiple stages through multilayer programmable architectures. While SIMs offer unprecedented degrees of freedom for enhancing physical-layer security, their very large number of meta-atoms creates a high-dimensional, strongly coupled optimization space in which conventional design approaches are inefficient and difficult to scale. Moreover, existing deep reinforcement learning (DRL) techniques suffer from slow convergence and performance degradation in dynamic wireless environments where the channel state information (CSI) of passive eavesdroppers is imperfect. To address these challenges, we propose a hybrid quantum proximal policy optimization (Q-PPO) framework for SIM-assisted secure communications that jointly optimizes transmit power allocation and SIM phase shifts to maximize the average secrecy rate under power and quality-of-service constraints. Specifically, a parameterized quantum circuit is embedded into the actor network, forming a hybrid classical-quantum policy architecture that enhances policy representation capability and exploration efficiency in high-dimensional continuous action spaces. Extensive simulations show that the proposed Q-PPO scheme consistently outperforms DRL baselines, achieving approximately 15% higher secrecy rates and 30% faster convergence under imperfect eavesdropper CSI. These results establish Q-PPO as an effective optimization paradigm for SIM-enabled secure wireless networks.
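The two quantities at the core of the abstract, the secrecy rate being maximized and the PPO objective used to train the policy, can be illustrated with a minimal sketch. It uses the standard textbook definitions, not the paper's system model: the function names, the scalar SNR inputs, and the clipping parameter `eps = 0.2` are all assumptions made for illustration, and the quantum actor network is not modeled here.

```python
import math


def secrecy_rate(snr_user: float, snr_eve: float) -> float:
    """Secrecy rate in bit/s/Hz: [log2(1 + snr_user) - log2(1 + snr_eve)]^+.

    snr_user and snr_eve are linear (not dB) SNRs at the legitimate user
    and at the eavesdropper; both are hypothetical scalar inputs.
    """
    return max(0.0, math.log2(1.0 + snr_user) - math.log2(1.0 + snr_eve))


def ppo_clipped_term(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    ratio is the new-to-old policy probability ratio for one action,
    advantage is its estimated advantage; eps = 0.2 is an assumed default.
    """
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # The user's channel is stronger than the eavesdropper's: positive secrecy rate.
    print(secrecy_rate(15.0, 3.0))
    # The eavesdropper's channel is stronger: the rate is floored at zero.
    print(secrecy_rate(1.0, 3.0))
    # A large policy ratio with positive advantage is clipped at 1 + eps.
    print(ppo_clipped_term(1.5, 2.0))
```

In a setup like the one described, the SNRs would depend on the transmit power allocation and the SIM phase shifts, and a reward built from the secrecy rate would drive the advantage estimates; how the paper constructs that reward and handles the constraints is not stated in the abstract.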