For marine biologists, identifying the dependence structure between marine species and environmental variables, such as sea surface temperature and ocean depth, is essential for characterizing ecosystem functioning and understanding the dynamics of marine ecosystems. However, the observed data contain not only continuous variables but also discrete ones, such as binary indicators and counts (together referred to as mixed outcomes), and they are spatially correlated. These two features make conventional multivariate analysis tools impractical. To address this issue, we propose a semiparametric Bayesian inference method and develop an efficient Markov chain Monte Carlo algorithm that computes the posterior of the dependence structure using the rank likelihood under a latent multivariate spatial Gaussian process. To alleviate the computational burden of the Gaussian process, we also provide a scalable implementation based on the nearest-neighbor Gaussian process. Extensive numerical experiments show that the proposed method reliably infers the dependence structure of spatially correlated mixed outcomes. Finally, we apply the proposed method to a dataset collected during an international synoptic krill survey in the Scotia Sea near the Antarctic Peninsula to infer the dependence structure between fin whales (Balaenoptera physalus), krill biomass, and relevant oceanographic data.
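To make the rank-likelihood idea concrete, the sketch below shows one Gibbs sweep over the latent values of a single outcome. It is an illustration under stated assumptions, not the authors' algorithm: the function names `sample_truncated_normal` and `rank_likelihood_sweep` are hypothetical, and the conditional mean and standard deviation of each latent value, which in the full model would come from the latent multivariate spatial Gaussian process, are simply passed in as inputs. The key point is that the observed values enter only through their ordering, so continuous, binary, and count outcomes can be handled the same way.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for inverse-CDF sampling


def sample_truncated_normal(mean, sd, lower, upper, rng):
    """Draw from N(mean, sd^2) restricted to [lower, upper]."""
    a = _STD.cdf((lower - mean) / sd) if lower > -math.inf else 0.0
    b = _STD.cdf((upper - mean) / sd) if upper < math.inf else 1.0
    u = a + (b - a) * rng.random()
    # Keep u strictly inside (0, 1) so inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    z = mean + sd * _STD.inv_cdf(u)
    # Guard against numerical drift in the extreme tails.
    return min(max(z, lower), upper)


def rank_likelihood_sweep(y, z, cond_mean, cond_sd, rng):
    """One Gibbs sweep over latent values z for a single outcome y.

    Each z[i] is redrawn from its (assumed, user-supplied) conditional
    normal, truncated so that the ordering of z stays consistent with
    the ordering of the observed y. Tied y values impose no mutual
    constraint. The input z must already respect the ordering of y.
    """
    z = list(z)
    n = len(y)
    for i in range(n):
        # Largest latent value among observations ranked below y[i].
        lower = max((z[j] for j in range(n) if y[j] < y[i]), default=-math.inf)
        # Smallest latent value among observations ranked above y[i].
        upper = min((z[j] for j in range(n) if y[j] > y[i]), default=math.inf)
        z[i] = sample_truncated_normal(cond_mean[i], cond_sd[i], lower, upper, rng)
    return z
```

A typical call would start from latent values that already respect the ordering of `y` (for example, normal scores of the ranks) and repeat `rank_likelihood_sweep` inside a larger sampler that also updates the dependence parameters. The quadratic scan for the bounds is chosen for readability; a production implementation would sort once and track neighbors.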