In 5G mobile communication systems, multi-user multiple-input multiple-output (MU-MIMO) is used to enhance spectral efficiency and support high data rates. To maximize spectral efficiency while maintaining fairness among users, the base station (BS) needs to select a subset of users for data transmission. Because this selection problem is NP-hard, methods based on deep reinforcement learning (DRL) have been proposed to infer near-optimal solutions in real time, yet this approach has an intrinsic security problem. This paper investigates how a group of adversarial users can exploit unsanitized raw channel state information (CSI) to launch a throughput degradation attack. Most existing studies consider only systems in which adversarial users can obtain the exact values of the victims' CSIs, which is impractical for uplink transmission in LTE/5G mobile systems. We note that the DRL policy contains an observation normalizer, which stores the mean and variance of the observations to improve training convergence. Based solely on that observation normalizer, adversarial users can estimate upper and lower bounds on the local observations, including the victims' CSIs. We develop an attack scheme, FGGM, that leverages polytope abstract domains, a technique for bounding the outputs of a neural network given ranges on its inputs. Our goal is to find a single set of intentionally manipulated CSIs that achieves the attack goals over the whole range of the victims' local observations. Experimental results demonstrate that FGGM can determine a set of adversarial CSI vectors controlled by the adversarial users and then reuse those CSIs throughout the simulation to reduce a victim's network throughput by up to 70%, without knowing the exact values of the victims' local observations. This work serves as a case study, and the approach can be applied to many other DRL-based problems, such as knapsack-oriented resource allocation problems.
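To make the bounding idea concrete, the following is a minimal sketch and not the FGGM scheme itself. It uses plain interval (box) arithmetic, a coarser relative of the polytope abstract domains named above, to bound the outputs of a small ReLU network. Everything specific here is an assumption made for illustration: the `mean ± k·std` rule for turning normalizer statistics into a range, the function names, and the toy weights.

```python
import math

def bounds_from_normalizer(mean, var, k=3.0):
    """Assumed rule: treat mean +/- k*std as the plausible observation range."""
    lo = [m - k * math.sqrt(v) for m, v in zip(mean, var)]
    hi = [m + k * math.sqrt(v) for m, v in zip(mean, var)]
    return lo, hi

def affine_bounds(W, b, lo, hi):
    """Interval bounds on y = W x + b for every x with lo <= x <= hi."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = h = bias
        for w, a, c in zip(row, lo, hi):
            # A positive weight maps lower to lower; a negative weight swaps ends.
            if w >= 0:
                l += w * a
                h += w * c
            else:
                l += w * c
                h += w * a
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi

def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each end of the interval."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]

def network_bounds(layers, lo, hi):
    """layers is a list of (W, b); ReLU follows every layer except the last."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(layers) - 1:
            lo, hi = relu_bounds(lo, hi)
    return lo, hi

# Toy policy with two inputs (victim feature, adversarial feature) and two
# output scores (victim, adversary). All numbers are hypothetical.
layers = [
    ([[1.0, -1.0], [0.5, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.5], [-1.0, 1.0]], [0.0, 0.5]),
]

# The victim's feature is only known as a range from the normalizer statistics.
victim_lo, victim_hi = bounds_from_normalizer([0.0], [1.0], k=2.0)
# The adversary fixes its own feature, so its interval is a single point.
crafted = 3.0
out_lo, out_hi = network_bounds(layers, victim_lo + [crafted], victim_hi + [crafted])

# If the adversary's lowest possible score exceeds the victim's highest
# possible score, the crafted value wins over the whole victim range.
attack_certified = out_lo[1] > out_hi[0]
print(out_lo, out_hi, attack_certified)
```

In this toy setting the check succeeds for every victim value in the assumed range, which mirrors the goal of finding one manipulated input that works without knowing the victim's exact observation. Polytope domains track linear relations between neurons and generally give tighter bounds than the box domain used here.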