In compositional data, it is important to detect which part of the whole drives heterogeneity. The aim is to propose a procedure that quantifies this heterogeneity in a multivariate regression context without abandoning the data's natural restriction. A single probabilistic model with a hierarchical structure was built for multiple compositional data. An objective criterion based on skewness and kurtosis metrics helps characterize each component's performance and assists in choosing one component as the reference, which avoids model identifiability issues. Inference was carried out under the Bayesian approach, using the Hamiltonian Monte Carlo (HMC) method to obtain the posterior distribution of interest. The Kullback-Leibler divergence (KLD) from information theory and the Aitchison distance are computed to measure the similarity between compositions and to compare scenarios in the model validation process. The proposal was motivated by a composition structure with high uncertainty in the Abrolhos Reefs of Brazil, a consequence of a dam rupture. The results support an understanding of patterns in the studied process, recognizing local effects on each component and quantifying the precision parameter. These findings contribute to characterizing the marine life community in areas affected by anthropogenic damage.
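To make the two similarity measures named above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes every part is strictly positive (zeros would need separate handling), computes the Aitchison distance as the Euclidean distance between centered log-ratio (clr) transforms, and uses hypothetical function names and made-up example values.

```python
import numpy as np

def closure(x):
    """Rescale a vector of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def clr(x):
    """Centered log-ratio transform of a strictly positive composition."""
    log_x = np.log(closure(x))
    return log_x - log_x.mean()

def aitchison_distance(x, y):
    """Aitchison distance: Euclidean distance between clr transforms."""
    return float(np.linalg.norm(clr(x) - clr(y)))

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) between two compositions."""
    p, q = closure(p), closure(q)
    return float(np.sum(p * np.log(p / q)))

# Illustrative values only (not taken from the study).
observed = [0.5, 0.3, 0.2]
predicted = [0.4, 0.4, 0.2]

print("Aitchison distance:", aitchison_distance(observed, predicted))
print("KL(observed || predicted):", kl_divergence(observed, predicted))
```

Both measures are zero when the two compositions coincide. The Aitchison distance is symmetric and unaffected by rescaling either input, whereas the KLD is not symmetric, so the order of its arguments matters.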