This paper attempts to characterize the physical scenarios in which an online learning-based cognitive radar can be expected to reliably outperform a fixed rule-based waveform selection strategy, and those in which the converse holds. We seek general insights by examining two decision-making scenarios: dynamic spectrum access and multiple-target tracking. The radar scene is characterized by inducing a state-space model and examining the structure of its underlying Markov state transition matrix in terms of entropy rate and diagonality. We find that entropy rate is a strong predictor of the performance of online learning-based waveform selection, while diagonality is a better predictor of the performance of fixed rule-based waveform selection. We show that these measures can be used to predict first- and second-order stochastic dominance relationships, which allows system designers to use simple decision rules in place of more cumbersome learning approaches under certain conditions. We validate our findings through numerical results for each application and provide guidelines for future implementations.
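To make the two scene descriptors concrete, the following is a minimal sketch of how they might be computed for a given Markov state transition matrix. The entropy rate uses the standard definition for a stationary ergodic chain, $H = -\sum_i \pi_i \sum_j P_{ij} \log_2 P_{ij}$, where $\pi$ is the stationary distribution. The abstract does not give the paper's definition of diagonality, so the `diagonality` function below is an assumed proxy (the stationary probability of remaining in the current state) and may differ from the measure used in the paper. All function names are hypothetical.

```python
import numpy as np


def stationary_distribution(P):
    """Stationary distribution of an ergodic row-stochastic matrix P."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Solve pi @ P = pi together with sum(pi) = 1 as a least-squares system.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def entropy_rate(P):
    """Entropy rate in bits per step of the chain with transition matrix P."""
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    # Treat 0 * log2(0) as 0 by only taking logs of positive entries.
    log_p = np.zeros_like(P)
    mask = P > 0
    log_p[mask] = np.log2(P[mask])
    return float(-np.sum(pi[:, None] * P * log_p))


def diagonality(P):
    """ASSUMED proxy, not the paper's definition: the stationary
    probability that the chain stays in its current state."""
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    return float(np.sum(pi * np.diag(P)))


# Two illustrative two-state scenes (values chosen for illustration only).
sticky = np.array([[0.9, 0.1],
                   [0.1, 0.9]])   # state rarely changes
uniform = np.array([[0.5, 0.5],
                    [0.5, 0.5]])  # next state is independent of the current one

for name, P in [("sticky", sticky), ("uniform", uniform)]:
    print(name, entropy_rate(P), diagonality(P))
```

Under this proxy, the sticky scene has a lower entropy rate and a higher diagonality than the uniform scene. How these values map to the relative performance of learning-based and rule-based selection is the subject of the paper's analysis and is not reproduced here.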