Binary responses arise in many statistical problems, including binary classification, bioassay, current status data problems and sensitivity estimation. The Bayesian nonparametrics community has been interested in such problems since the early 1970s, but inference from binary data is intractable for a wide range of modern simulation-based models, even with MCMC methods. Recently, Christensen (2023) introduced a novel simulation technique based on counting permutations, which can estimate both posterior distributions and marginal likelihoods for any model from which a random sample can be generated. However, the accompanying implementation of this technique struggles when the sample size is large (n > 250). Here we present perms, a new implementation of the technique that is substantially faster than the original and can handle larger data sets. It is available both as an R package and as a Python library. We illustrate the basic usage of perms with two simple examples, a tractable toy problem and a bioassay problem, and then consider a more complex example involving changepoint analysis. We also describe the details of the implementation and demonstrate the computational speed gain of perms in a simple simulation study.