Leveraging available measurements of our environment can help us understand complex processes. One example is Argo Biogeochemical data, collected with the aim of measuring oxygen, nitrate, pH, and other variables at varying depths in the ocean. We focus on oxygen data in the Southern Ocean, which have implications for ocean biology and the Earth's carbon cycle. Systematic monitoring of oxygen has only recently begun, and the data are sparse. In contrast, Argo measurements of temperature and salinity are much more abundant. In this work, we introduce and estimate a functional regression model that describes the dependence among oxygen, temperature, and salinity data at all depths covered by the Argo data simultaneously. Our model elucidates important aspects of the joint distribution of temperature, salinity, and oxygen. Because fronts establish distinct spatial zones in the Southern Ocean, we augment this functional regression model with a mixture component. By modelling spatial dependence in both the mixture component and the data itself, we provide predictions on a grid and improve estimates of front locations. Our approach scales to the size of the Argo data, and we demonstrate its success through cross-validation and a comprehensive interpretation of the model.