Clusters of similar or dissimilar objects are encountered in many fields. Frequently used approaches treat the central object of each cluster as latent. Yet, often objects of one or more types cluster around objects of another type. Such arrangements are common in biomedical images of cells, in which nearby cell types likely interact. Quantifying spatial relationships may elucidate biological mechanisms. Parent-offspring statistical frameworks can be usefully applied even when central objects (parents) differ from peripheral ones (offspring). We propose the novel multivariate cluster point process (MCPP) to quantify multi-object (e.g., multi-cellular) arrangements. Unlike commonly used approaches, the MCPP exploits locations of the central parent object in clusters. It accounts for possibly multilayered, multivariate clustering. The model formulation requires specification of which object types function as cluster centers and which reside peripherally. If such information is unknown, the relative roles of object types may be explored by comparing fit of different models via the deviance information criterion (DIC). In simulated data, we compared DIC of a series of models; the MCPP correctly identified simulated relationships. It also produced more accurate and precise parameter estimates than the classical univariate Neyman-Scott process model. We also used the MCPP to quantify proposed configurations and explore new ones in human dental plaque biofilm image data. MCPP models quantified simultaneous clustering of Streptococcus and Porphyromonas around Corynebacterium and of Pasteurellaceae around Streptococcus and successfully captured hypothesized structures for all taxa. Further exploration suggested the presence of clustering between Fusobacterium and Leptotrichia, a previously unreported relationship.
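The parent-offspring idea described above, in which offspring of one type cluster around observed parents of a different type, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' MCPP or its fitting procedure: it assumes a fixed number of uniformly placed parents, Poisson-distributed offspring counts, and isotropic Gaussian displacement of offspring around their parent (a Thomas-type mechanism). The function names, parameter values, and the nearest-parent-distance summary are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_parent_offspring(n_parents=20, mean_offspring=15, sigma=0.03, seed=0):
    """Simulate observed parents and a second object type clustered around them.

    Assumptions (illustrative only): parents are uniform on the unit square,
    each parent has a Poisson number of offspring, and offspring are displaced
    from their parent by isotropic Gaussian noise with standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    parents = rng.uniform(0.0, 1.0, size=(n_parents, 2))
    counts = rng.poisson(mean_offspring, size=n_parents)
    # Repeat each parent location once per offspring, then add the displacement.
    centers = np.repeat(parents, counts, axis=0)
    offspring = centers + rng.normal(0.0, sigma, size=centers.shape)
    # Keep only offspring that fall inside the observation window.
    inside = np.all((offspring >= 0.0) & (offspring <= 1.0), axis=1)
    return parents, offspring[inside]


def mean_nearest_parent_distance(points, parents):
    """Average distance from each point to its nearest parent."""
    diffs = points[:, None, :] - parents[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return dists.min(axis=1).mean()


parents, offspring = simulate_parent_offspring()

# Reference pattern: the same number of points placed uniformly at random,
# i.e. with no relationship to the parents.
reference_rng = np.random.default_rng(1)
random_points = reference_rng.uniform(0.0, 1.0, size=offspring.shape)

d_clustered = mean_nearest_parent_distance(offspring, parents)
d_random = mean_nearest_parent_distance(random_points, parents)

print(f"mean nearest-parent distance, clustered offspring: {d_clustered:.3f}")
print(f"mean nearest-parent distance, uniform reference:   {d_random:.3f}")
```

Because the parent locations are observed rather than latent, the offspring-to-parent distances can be computed directly, which is the kind of information the abstract says the MCPP exploits. In this sketch the clustered offspring sit much closer to their nearest parent than the uniform reference points do.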