Timely and reliable status updates are essential for emerging QoS-sensitive wireless applications. Common age of information (AoI)-based metrics, such as average AoI and age violation rate (AVR), characterize time-averaged freshness or violation frequency but do not explicitly capture the temporal persistence of consecutive age violations, which can be critical in safety-sensitive wireless applications. We develop a persistence-aware reliability framework based on the consecutive age violation rate (C-AVR) vector, whose components quantify AoI threshold violations over consecutive time windows of different lengths. Through flexible weighting schemes, the proposed framework unifies reliability objectives ranging from average persistence to tail-sensitive performance. Optimizing weighted C-AVR objectives is challenging because consecutive violations are temporally correlated, leading to sparse learning signals. To address this issue, we develop a distributional reinforcement learning approach based on a quantile regression dueling double deep Q-network (QR-D3QN). By modeling a quantile-based return distribution rather than only a scalar expected return, QR-D3QN provides richer value-estimation signals for rare but prolonged violation sequences under stochastic packet arrivals, unreliable channels, and transmission cost constraints. Simulation results show that QR-D3QN consistently outperforms expectation-based baselines across a wide range of weighting schemes and system settings, with particularly significant gains under tail-sensitive persistence objectives. Component-wise analysis further shows that distributional value learning substantially improves reliability across multiple persistence scales, especially for long consecutive violation sequences. Overall, our results establish the proposed C-AVR framework as an effective foundation for persistence-aware reliability evaluation.
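To make the C-AVR vector concrete, the following minimal sketch computes an empirical version from an AoI trace. The abstract does not give the formal definition, so the code assumes one plausible reading: component k is the fraction of length-k sliding windows in which every slot has AoI strictly above the threshold. The function names `c_avr_vector` and `weighted_c_avr`, the parameter names, and the example trace are all hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def c_avr_vector(aoi: Sequence[int], threshold: int, max_len: int) -> List[float]:
    """Empirical C-AVR vector under an ASSUMED definition (not from the source).

    Component k (1-based) is the fraction of length-k sliding windows in which
    every slot has AoI strictly above `threshold`.
    """
    n = len(aoi)
    if max_len < 1 or max_len > n:
        raise ValueError("max_len must be between 1 and len(aoi)")
    counts = [0] * max_len
    run = 0  # length of the current run of consecutive violations
    for age in aoi:
        run = run + 1 if age > threshold else 0
        # A length-k window ending at this slot is fully violated iff run >= k.
        for k in range(1, min(run, max_len) + 1):
            counts[k - 1] += 1
    # There are n - k + 1 sliding windows of length k in a trace of n slots.
    return [counts[k - 1] / (n - k + 1) for k in range(1, max_len + 1)]


def weighted_c_avr(c_avr: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted scalar objective: the inner product of weights and C-AVR components."""
    if len(c_avr) != len(weights):
        raise ValueError("weights and c_avr must have the same length")
    return sum(w * c for w, c in zip(weights, c_avr))


# Hypothetical AoI trace; the AoI resets to 1 after each successful update.
trace = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1]
vec = c_avr_vector(trace, threshold=2, max_len=3)
uniform = weighted_c_avr(vec, [1 / 3, 1 / 3, 1 / 3])  # average-persistence style weights
tail = weighted_c_avr(vec, [0.0, 0.0, 1.0])           # tail-sensitive: longest window only
print(vec, uniform, tail)
```

For this trace the components are 0.5, 1/3, and 0.125. Under the assumed definition, the first component coincides with the ordinary AVR, while larger k isolates longer violation streaks. Shifting weight toward the later components moves the objective from average persistence toward tail-sensitive performance.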
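The distributional component can be illustrated with the quantile Huber loss commonly used in quantile regression Q-learning. This is a dependency-free sketch of that standard loss for a single state-action pair, not the authors' implementation. The function names, the midpoint quantile levels, and the default `kappa=1.0` are assumptions made for illustration.

```python
from typing import Sequence


def huber(u: float, kappa: float) -> float:
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)


def quantile_huber_loss(
    pred: Sequence[float], target: Sequence[float], kappa: float = 1.0
) -> float:
    """Quantile Huber loss between predicted quantiles and target return samples.

    pred[i] is the estimate of the quantile at level tau_i = (2i + 1) / (2N)
    (midpoint levels, an assumption). The loss sums over predicted quantiles
    and averages over target samples.
    """
    n, m = len(pred), len(target)
    if n == 0 or m == 0:
        raise ValueError("pred and target must be non-empty")
    total = 0.0
    for i, theta in enumerate(pred):
        tau = (2 * i + 1) / (2 * n)
        for z in target:
            u = z - theta  # pairwise temporal-difference error
            # Asymmetric weight: under-estimates are weighted by tau,
            # over-estimates by (1 - tau).
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber(u, kappa)
    return total / m


# A single quantile at level 0.5, with a target 2.0 above the estimate.
print(quantile_huber_loss([0.0], [2.0]))  # 0.5 * (2.0 - 0.5) = 0.75
```

Because each quantile is penalized asymmetrically, the upper quantiles are pulled toward the upper tail of the return distribution. This is the mechanism by which a distributional learner can retain information about rare outcomes, such as prolonged violation sequences, that a scalar expected return averages away.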