Optimal Dorfman Group Testing for Symmetric Distributions

We study Dorfman's classical group testing protocol in a novel setting where individual specimen statuses are modeled as exchangeable random variables. We are motivated by infectious disease screening. In that case, specimens which arrive together for testing often originate from the same community and so their statuses may exhibit positive correlation. Dorfman's protocol screens a population of n specimens for a binary trait by partitioning it into non-overlapping groups, testing these, and only individually retesting the specimens of each positive group. The partition is chosen to minimize the expected number of tests under a probabilistic model of specimen statuses. We relax the typical assumption that these are independent and identically distributed and instead model them as exchangeable random variables. In this case, their joint distribution is symmetric in the sense that it is invariant under permutations. We give a characterization of such distributions in terms of a function q where q(h) is the marginal probability that any group of size h tests negative. We use this interpretable representation to show that the set partitioning problem arising in Dorfman's protocol can be reduced to an integer partitioning problem and efficiently solved. We apply these tools to an empirical dataset from the COVID-19 pandemic. The methodology helps explain the unexpectedly high empirical efficiency reported by the original investigators.
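
The reduction to integer partitioning can be illustrated with a small sketch. It assumes the standard Dorfman cost for a group of size h: one test if h = 1, otherwise one pooled test plus h retests with probability 1 - q(h). Because exchangeability makes this cost depend only on the group size, a dynamic program over sizes suffices. The names `group_cost` and `optimal_partition` are hypothetical, and the i.i.d. choice q(h) = (1 - p)^h in the usage lines is only a stand-in for illustration; this is not necessarily the algorithm used in the paper.

```python
from typing import Callable, List, Tuple


def group_cost(h: int, q: Callable[[int], float]) -> float:
    """Expected number of tests for one group of size h (assumed Dorfman cost)."""
    if h == 1:
        return 1.0  # a lone specimen is simply tested once
    # One pooled test, plus h individual retests if the pool is positive.
    return 1.0 + h * (1.0 - q(h))


def optimal_partition(n: int, q: Callable[[int], float]) -> Tuple[float, List[int]]:
    """Minimize total expected tests over integer partitions of n."""
    best = [0.0] + [float("inf")] * n  # best[m]: minimal cost for m specimens
    choice = [0] * (n + 1)             # choice[m]: size of one group in an optimum
    for m in range(1, n + 1):
        for h in range(1, m + 1):
            cost = best[m - h] + group_cost(h, q)
            if cost < best[m]:
                best[m] = cost
                choice[m] = h
    # Walk back through the recorded choices to recover the group sizes.
    sizes: List[int] = []
    m = n
    while m > 0:
        sizes.append(choice[m])
        m -= choice[m]
    return best[n], sorted(sizes, reverse=True)


# Usage with an illustrative i.i.d. model (prevalence p = 0.01 is an assumption).
p = 0.01
expected_tests, sizes = optimal_partition(100, lambda h: (1.0 - p) ** h)
print(expected_tests, sizes)
```

The dynamic program runs in O(n^2) evaluations of q, whereas enumerating set partitions directly would grow far faster with n. Any other exchangeable model can be plugged in by supplying a different q.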

