The spread of an undesirable contact process, such as an infectious disease (e.g., COVID-19), is contained through testing and isolation of infected nodes. The temporal and spatial evolution of the process, together with containment through isolation, renders such detection fundamentally different from active search detection strategies. In this work, through an active learning approach, we design testing and isolation strategies to contain the spread and minimize the cumulative infections under a given test budget. We prove that the objective can be optimized, with performance guarantees, by greedily selecting the nodes to test. We further design reward-based methodologies that effectively minimize an upper bound on the cumulative infections and are computationally more tractable in large networks. These policies, however, require knowledge of the nodes' infection probabilities, which change dynamically and must be learned through sequential testing. We develop a message-passing framework for this purpose and, building on it, show novel tradeoffs between exploitation of knowledge through reward-based heuristics and exploration of the unknown through carefully designed probabilistic testing. These tradeoffs are fundamentally distinct from their classical counterparts in active search or multi-armed bandit problems (MABs). We prove the necessity of exploration in a stylized network and show through simulations that exploration can outperform exploitation in various synthetic and real-data networks, depending on the parameters of the network and the spread.
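To make the exploitation/exploration distinction concrete, the following is a minimal sketch in Python and not the policies developed in this work. It assumes that estimates of each node's infection probability are already available (here hard-coded, whereas in the paper they would come from the message-passing framework). The `reward` function, the toy path graph, and all names are hypothetical choices made for illustration only.

```python
import random

def reward(node, p_infected, neighbors):
    # Hypothetical reward (an assumption, not the paper's definition):
    # probability the node is infected times the expected number of
    # neighbors that are not yet infected and could be protected.
    return p_infected[node] * sum(1.0 - p_infected[v] for v in neighbors[node])

def exploit(p_infected, neighbors, budget):
    # Exploitation: greedily test the `budget` nodes with the highest reward.
    ranked = sorted(p_infected,
                    key=lambda u: reward(u, p_infected, neighbors),
                    reverse=True)
    return ranked[:budget]

def explore(p_infected, neighbors, budget, rng):
    # Exploration: probabilistic testing. Nodes are sampled without
    # replacement with weight proportional to their reward, so that
    # low-reward nodes still have a chance of being tested.
    candidates = list(p_infected)
    chosen = []
    while candidates and len(chosen) < budget:
        weights = [reward(u, p_infected, neighbors) + 1e-9 for u in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen

# Toy path graph 0 - 1 - 2 - 3 with assumed infection probability estimates.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
p_infected = {0: 0.9, 1: 0.5, 2: 0.1, 3: 0.05}

greedy_tests = exploit(p_infected, neighbors, budget=2)
random_tests = explore(p_infected, neighbors, budget=2, rng=random.Random(0))
```

In this toy instance the greedy rule selects node 1 before node 0: node 0 is more likely to be infected, but node 1 has more susceptible neighbors under the assumed reward. The exploratory rule may instead spend part of the budget on nodes whose status is less certain, which is the kind of choice whose value the paper analyzes.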