Optimal Dorfman Group Testing For Symmetric Distributions

We study Dorfman's classical group testing protocol in a novel setting where individual specimen statuses are modeled as exchangeable random variables. We are motivated by infectious disease screening, where specimens that arrive together for testing often originate from the same community, so their statuses may be positively correlated. Dorfman's protocol screens a population of n specimens for a binary trait by partitioning it into nonoverlapping groups, testing each group, and individually retesting only the specimens of each positive group. The partition is chosen to minimize the expected number of tests under a probabilistic model of specimen statuses. We relax the typical assumption that the statuses are independent and identically distributed and instead model them as exchangeable random variables. In this case, their joint distribution is symmetric in the sense that it is invariant under permutations. We characterize such distributions in terms of a function q, where q(h) is the marginal probability that any group of size h tests negative. We use this interpretable representation to show that the set partitioning problem arising in Dorfman's protocol can be reduced to an integer partitioning problem and solved efficiently. We apply these tools to an empirical dataset from the COVID-19 pandemic. The methodology helps explain the unexpectedly high empirical efficiency reported by the original investigators.
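
To make the reduction concrete, here is a minimal sketch of how the integer partitioning problem might be solved. It is an illustration under stated assumptions, not necessarily the authors' algorithm. It assumes a group of size h >= 2 costs one pooled test plus h retests when the pool is positive, giving an expected cost of 1 + h(1 - q(h)), and that a group of size 1 costs exactly one test. Because this cost depends only on the group size, the expected total depends only on the multiset of group sizes, so a simple dynamic program over sizes suffices. The names `group_cost` and `optimal_partition` are hypothetical.

```python
from typing import Callable, List, Tuple


def group_cost(h: int, q: Callable[[int], float]) -> float:
    """Expected number of tests spent on one group of size h.

    Assumption: q(h) is the probability that a group of size h tests negative.
    """
    if h == 1:
        return 1.0  # a lone specimen is tested once; no retest is needed
    # One pooled test, plus h individual retests with probability 1 - q(h).
    return 1.0 + h * (1.0 - q(h))


def optimal_partition(n: int, q: Callable[[int], float]) -> Tuple[float, List[int]]:
    """Minimize the expected number of tests over integer partitions of n.

    best[m] holds the minimum expected cost of screening m specimens;
    choice[m] records one group size used by an optimal solution for m.
    Runs in O(n^2) evaluations of group_cost.
    """
    best = [0.0] + [float("inf")] * n
    choice = [0] * (n + 1)
    for m in range(1, n + 1):
        for h in range(1, m + 1):
            cost = group_cost(h, q) + best[m - h]
            if cost < best[m]:
                best[m] = cost
                choice[m] = h
    # Walk back through the recorded choices to recover the group sizes.
    sizes = []
    m = n
    while m > 0:
        sizes.append(choice[m])
        m -= choice[m]
    return best[n], sorted(sizes, reverse=True)
```

As a usage example, the independent and identically distributed case with prevalence p is the special case q(h) = (1 - p)^h, so `optimal_partition(100, lambda h: 0.99 ** h)` returns the minimum expected number of tests and the corresponding group sizes for 100 specimens at 1% prevalence. A positively correlated model would supply a different q with larger values of q(h), which lowers the expected cost of pooling.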

