In hazardous environments, sensors and actuators can be deployed to sense and act on behalf of humans, enabling safe and efficient task execution. Acting as a neural center, the edge information hub (EIH) integrates communication and computing capabilities and coordinates these sensors and actuators into sensing-communication-computing-control (SC3) closed loops for autonomous operation. From a system-level optimization perspective, this paper addresses joint sensor-actuator pairing and resource allocation across multiple SC3 closed loops. To tackle the resulting mixed-integer nonlinear programming problem, we develop a learning-optimization-integrated actor-critic (LOAC) framework, in which a deep neural network-based actor generates pairing candidates and an optimization-based critic then allocates communication and computing resources. The critic's feedback is used to refine the actor iteratively. Simulation results show that the LOAC framework achieves near-optimal solutions with low computational complexity and significantly reduces control cost.
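The actor-critic loop described above can be illustrated with a small toy sketch. It is not the paper's implementation; everything in it is an assumption made for illustration. A tabular softmax policy (`logits`) stands in for the deep neural network actor. The critic solves a hypothetical convex allocation problem: minimize the sum of `w_k / x_k` over the paired loops subject to the allocations `x_k` summing to `budget`, which has the closed-form solution `x_k` proportional to `sqrt(w_k)`. The `weights` matrix, the function names, and the update rule are likewise hypothetical.

```python
import math
import random
from itertools import permutations


def critic(weights, pairing, budget=1.0):
    """Toy optimization-based critic (assumed cost model, not the paper's).

    Splits `budget` across the loops to minimize sum_k w_k / x_k.
    The KKT conditions give x_k proportional to sqrt(w_k).
    """
    roots = [math.sqrt(weights[i][j]) for i, j in enumerate(pairing)]
    total = sum(roots)
    allocation = [budget * r / total for r in roots]
    cost = total ** 2 / budget  # optimal value of the toy problem
    return cost, allocation


def sample_pairing(logits, rng):
    """Toy actor: sample a one-to-one pairing, sensor i -> actuator pairing[i]."""
    free = list(range(len(logits)))
    pairing = []
    for row in logits:
        top = max(row[j] for j in free)  # shift logits for numerical stability
        probs = [math.exp(row[j] - top) for j in free]
        j = rng.choices(free, weights=probs, k=1)[0]
        pairing.append(j)
        free.remove(j)
    return tuple(pairing)


def train(weights, iterations=200, candidates=8, lr=0.5, seed=0):
    """Alternate candidate generation (actor) and evaluation (critic)."""
    rng = random.Random(seed)
    n = len(weights)
    logits = [[0.0] * n for _ in range(n)]
    best_cost, best_pairing = math.inf, None
    for _ in range(iterations):
        batch = [sample_pairing(logits, rng) for _ in range(candidates)]
        cost, pairing = min((critic(weights, p)[0], p) for p in batch)
        if cost < best_cost:
            best_cost, best_pairing = cost, pairing
        # Feedback step (assumed rule): raise the logits of the best pairing so far.
        for i, j in enumerate(best_pairing):
            logits[i][j] += lr
    return best_pairing, best_cost


def brute_force(weights):
    """Exhaustive baseline over all pairings; only feasible for tiny instances."""
    n = len(weights)
    return min((critic(weights, p)[0], p) for p in permutations(range(n)))


if __name__ == "__main__":
    # Hypothetical per-pair cost weights for 4 sensors and 4 actuators.
    w = [[4.0, 1.0, 9.0, 2.0],
         [1.0, 4.0, 1.0, 6.0],
         [9.0, 1.0, 4.0, 3.0],
         [2.0, 5.0, 7.0, 1.0]]
    print("learned:", train(w))
    print("exhaustive:", brute_force(w))
```

The sketch keeps the division of labor from the abstract: the discrete pairing decision is sampled by the learned component, while the continuous resource allocation is solved exactly for each candidate. The simple reinforcement rule used here can converge prematurely and gives no optimality guarantee; the exhaustive baseline is included only so the gap can be checked on small instances.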