Separate Exchangeability as Modeling Principle in Bayesian Nonparametrics

We argue for the use of separate exchangeability as a modeling principle in Bayesian nonparametric (BNP) inference. Separate exchangeability is \emph{de facto} widely applied in the Bayesian parametric case; for example, it arises naturally in simple mixed models. However, while separate and (closely related) joint exchangeability are widely used in some areas, such as random graphs, separate exchangeability is curiously underused in several other BNP applications. We briefly review the definition of separate exchangeability, focusing on its implications for Bayesian modeling. We then discuss two tractable classes of models that implement separate exchangeability and are the natural counterparts of familiar partially exchangeable BNP models. The first is nested random partitions for a data matrix, defining a partition of columns and partitions of rows nested within column clusters. Many recent models for nested partitions implement partially exchangeable models related to variations of the well-known nested Dirichlet process. We argue that inference under such models in some cases ignores important features of the experimental setup. We obtain the separately exchangeable counterpart of such partially exchangeable partition structures. The second class sets up separately exchangeable priors for a nonparametric regression model when multiple sets of experimental units are involved. We highlight how a Dirichlet process mixture of linear models, known as ANOVA DDP, can naturally implement separate exchangeability in such regression problems. Finally, we illustrate how to perform inference under such models in two real data examples.
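
For concreteness, the standard definition can be written out as follows. This is only a sketch, and the notation is assumed here for illustration rather than taken from the text above: $y_{ij}$ denotes entry $(i,j)$ of an $I \times J$ data matrix, and $\pi_1$ and $\pi_2$ denote permutations of the row and column indices, respectively. The array is separately exchangeable if
\[
  (y_{ij})_{i \le I,\, j \le J} \;\stackrel{d}{=}\; (y_{\pi_1(i)\,\pi_2(j)})_{i \le I,\, j \le J}
  \quad \mbox{for all permutations } \pi_1 \mbox{ of } \{1,\dots,I\} \mbox{ and } \pi_2 \mbox{ of } \{1,\dots,J\}.
\]
Joint exchangeability of a square array ($I = J$) requires this invariance only when $\pi_1 = \pi_2$. As a hypothetical instance of the simple mixed models mentioned above, take
\[
  y_{ij} = \alpha_i + \beta_j + \epsilon_{ij},
\]
with the $\alpha_i$ i.i.d., the $\beta_j$ i.i.d., the $\epsilon_{ij}$ i.i.d., and all three families mutually independent. Relabeling rows or columns leaves the joint law of $(y_{ij})$ unchanged, so the array is separately exchangeable.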
