We study handoff (HO) management in user-centric cell-free massive MIMO (UC-mMIMO) networks. Motivated by the importance of controlling the number of HOs, and by the dependence of efficient HO decisions on the temporal evolution of the channel conditions, we formulate a partially observable Markov decision process (POMDP) whose state space represents the discretized large-scale fading and whose action space represents the association decisions of the user with the access points (APs). We develop a novel algorithm that uses this model to derive a HO policy for a mobile user based on current and future rewards. To reduce the high complexity of the POMDP, we follow a divide-and-conquer approach: the formulation is broken into sub-problems, each solved separately. The policy and the candidate pool of APs of the sub-problem that yields the best total expected reward are then used to perform HOs within a specific time horizon. We further modify the algorithm to decrease the number of HOs. The results show that half of the HOs in UC-mMIMO networks can be eliminated: our solution controls the number of HOs while maintaining a rate guarantee, with a 47%-70% reduction in the cumulative number of HOs observed in networks with a density of 125 APs per km². Most importantly, our results show that a POMDP-based scheme is a promising way to control HOs.
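The divide-and-conquer step described in the abstract (solve each candidate AP pool separately, then keep the pool and policy with the best total expected reward) can be illustrated with a small sketch. This is not the authors' algorithm. For brevity it assumes the fading levels are fully observed, so each sub-problem is a plain finite-horizon MDP rather than a POMDP. It also assumes every AP's discretized fading level follows the same independent Markov chain `P`, that the reward is a per-level rate minus a fixed HO penalty, and that the first association is free. All names (`solve_subproblem`, `best_pool`, `ho_cost`, `rate`) are hypothetical.

```python
import itertools
import numpy as np

def solve_subproblem(P, rate, init_levels, horizon, ho_cost):
    """Backward induction for one candidate AP pool (fully observed stand-in).

    P           : (L, L) transition matrix of the discretized fading level (assumed)
    rate        : length-L array, reward for being served at each level (assumed)
    init_levels : initial fading level of each AP in the pool
    Returns the best total expected reward over `horizon` steps.
    """
    n_levels, k = P.shape[0], len(init_levels)
    joint = list(itertools.product(range(n_levels), repeat=k))
    # V[levels][s]: value when the pool is at `levels` and AP s currently serves the user
    V = {lv: np.zeros(k) for lv in joint}
    for _ in range(horizon):
        new_V = {}
        for lv in joint:
            # Expected continuation value if AP a is the serving AP after this step
            cont = np.zeros(k)
            for nxt in joint:
                p = np.prod([P[lv[i], nxt[i]] for i in range(k)])
                cont += p * V[nxt]
            q = np.array([rate[lv[a]] for a in range(k)]) + cont
            # Switching away from the current serving AP s costs ho_cost
            new_V[lv] = np.array([
                max(q[a] - (ho_cost if a != s else 0.0) for a in range(k))
                for s in range(k)
            ])
        V = new_V
    # First association is assumed free: pick the best starting AP
    return float(V[tuple(init_levels)].max())

def best_pool(P, rate, pools, horizon, ho_cost):
    """Solve every sub-problem and return the pool with the best expected reward."""
    scores = {name: solve_subproblem(P, rate, lv, horizon, ho_cost)
              for name, lv in pools.items()}
    return max(scores, key=scores.get), scores

# Toy usage: static channel (identity transitions), three fading levels
P = np.eye(3)
rate = np.array([0.0, 1.0, 2.0])
pools = {"A": (2, 0), "B": (1, 1)}
best, scores = best_pool(P, rate, pools, horizon=3, ho_cost=0.5)
```

In this toy setting the HO penalty enters the value function directly, so a larger `ho_cost` makes the resulting policy switch APs less often. That is one simple way to trade rate against the number of HOs, and it is not necessarily the modification the paper uses.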