Feature and trait allocation models are fundamental objects in Bayesian nonparametrics and play a prominent role in several applications. Existing approaches, however, typically assume full exchangeability of the data, which may be restrictive in settings characterized by heterogeneous but related groups. In this paper, we introduce a general and tractable class of Bayesian nonparametric priors for partially exchangeable trait allocation models, relying on completely random vectors. We provide a comprehensive theoretical analysis, including closed-form expressions for marginal and posterior distributions, and illustrate the tractability of our framework in the cases of binary and Poisson-distributed traits. A distinctive aspect of our approach is that the number of traits is a random quantity, thereby allowing us to model and estimate unobserved traits. Building on these results, we also develop a novel mixture model that infers the group partition structure from the data, effectively clustering trait allocations. This extension generalizes Bayesian nonparametric latent class models and avoids the systematic overclustering that arises when the number of traits is assumed to be fixed. We demonstrate the practical usefulness of our methodology through an application to the 'Ndrangheta criminal network from the Operazione Infinito investigation, where our model provides insights into the organization of illicit activities.