Deep learning (DL)-based resource allocation (RA) has recently gained considerable attention due to its performance efficiency. However, most related studies assume an ideal case in which the number of users and their utility demands, e.g., data rate constraints, are fixed, and the designed DL-based RA scheme exploits a policy trained only for these fixed parameters. Computationally expensive policy retraining is then required whenever these parameters change. This paper therefore introduces a DL-based resource allocator (ALCOR), which allows users to freely adjust their utility demands based on, e.g., their application layer. ALCOR employs deep neural networks (DNNs) as the policy in an iterative optimization algorithm. The algorithm optimizes the on-off status of users in a time-sharing problem so that their utility demands are satisfied in expectation. The policy performs unconstrained RA (URA), i.e., RA that does not take user utility demands into account, among the active users to maximize the sum utility (SU) at each time instant. Depending on the chosen URA scheme, ALCOR can perform RA in a model-based or model-free manner and in a centralized or distributed scenario. Convergence analyses provide guarantees for the convergence of ALCOR, and numerical experiments corroborate its effectiveness.
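The interplay between the on-off optimization and the URA policy can be illustrated with a toy simulation. This is only a minimal sketch under strong assumptions and not ALCOR's actual algorithm: the DNN policy is replaced by a closed-form rule (`ura_policy`, which serves the active user with the best channel in a single-resource slot), the on-off update is a simple hypothetical heuristic (users that lag behind their demand stay on, all others are switched off), and the channel model, `snr` value, demands, and mean gains are made up for illustration.

```python
import math
import random


def ura_policy(gains, active, snr=10.0):
    """Stand-in for the trained DNN policy: unconstrained RA among active users.

    Assumption: one shared resource per slot, so the sum rate is maximized
    by serving the active user with the largest channel gain.
    """
    rates = [0.0] * len(gains)
    candidates = [i for i, on in enumerate(active) if on]
    if candidates:
        best = max(candidates, key=lambda i: gains[i])
        rates[best] = math.log2(1.0 + snr * gains[best])
    return rates


def simulate(demands, mean_gains, slots=5000, seed=0):
    """Toy on-off loop; returns each user's time-averaged rate."""
    rng = random.Random(seed)
    n = len(demands)
    totals = [0.0] * n
    for t in range(slots):
        # Hypothetical channel model: exponentially distributed power gains.
        gains = [rng.expovariate(1.0 / m) for m in mean_gains]
        # Hypothetical on-off rule: users behind their demand stay on and
        # all others are switched off; if nobody is behind, everyone is on.
        behind = [totals[i] < demands[i] * t for i in range(n)]
        active = behind if any(behind) else [True] * n
        rates = ura_policy(gains, active)
        totals = [totals[i] + rates[i] for i in range(n)]
    return [total / slots for total in totals]


# Illustrative values: user 0 has a weak channel but a nonzero rate demand,
# user 2 has no demand and is served only to raise the sum utility.
demands = [0.3, 0.5, 0.0]
mean_gains = [0.1, 1.0, 1.0]
avg_rates = simulate(demands, mean_gains)
```

In this sketch the demands can be changed between runs without touching `ura_policy`, which mirrors the motivation stated in the abstract: the policy itself is demand-agnostic, and only the on-off status of the users reacts to the demands.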