In this paper, we study a remote monitoring system in which a receiver observes a remote binary Markov source and decides whether to sample and transmit its state through a randomly delayed channel. We adopt uncertainty of information (UoI), defined as the entropy of the source state conditioned on past observations at the receiver, as a metric of the value of information, in contrast to traditional state-agnostic nonlinear age-of-information (AoI) penalty functions. To address the limitation of prior UoI research, which assumes a one-time-slot delay, we extend the analysis to scenarios with random delays. We model the problem as a partially observable Markov decision process (POMDP) and simplify it to a semi-Markov decision process (SMDP) by introducing the belief state. We propose two algorithms for the long-term average UoI minimization problem: a globally optimal bisection relative value iteration (bisec-RVI) algorithm and a computationally efficient sub-optimal index-based threshold algorithm. Numerical simulations demonstrate that our sampling policies surpass traditional zero-wait and AoI-optimal policies, particularly under large delays, with the sub-optimal policy nearly matching the performance of the optimal one.
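As a minimal illustration of the quantities named above (not the paper's implementation), the sketch below evolves the receiver's belief for a two-state Markov source, computes UoI as the binary entropy of that belief, and applies a hypothetical threshold-style sampling rule; the flip probability p, the threshold value, and the helper names are illustrative assumptions.

```python
import numpy as np

def belief_update(belief, p):
    """One-step belief evolution for a binary Markov source.

    belief: receiver's probability that the source is in state 1.
    p: assumed symmetric per-slot flip probability (illustrative).
    """
    return belief * (1 - p) + (1 - belief) * p

def uoi(belief):
    """Uncertainty of information: entropy of the source state
    conditioned on past receiver observations, i.e. the binary
    entropy of the belief."""
    b = np.clip(belief, 1e-12, 1 - 1e-12)
    return -(b * np.log2(b) + (1 - b) * np.log2(1 - b))

def threshold_policy(belief, threshold):
    """Hypothetical index-style rule: sample when the current UoI
    exceeds a threshold (a stand-in for an index-based policy)."""
    return uoi(belief) >= threshold

# Example: with no new delivery, the belief drifts toward 0.5, so UoI
# grows toward 1 bit and eventually triggers a sampling decision.
belief, p, threshold = 0.0, 0.1, 0.8
for t in range(10):
    belief = belief_update(belief, p)
    print(t, round(uoi(belief), 3), threshold_policy(belief, threshold))
```

The example only shows how UoI grows between deliveries under a known transition probability; the paper's policies additionally account for the random channel delay through the SMDP formulation.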