The Rapid Carbon Assessment, conducted by the U.S. Department of Agriculture, was designed to obtain a representative sample of soil organic carbon across the contiguous United States. Combined with a statistical model, the dataset makes it possible to map predicted soil carbon across the U.S.; however, such an effort faces two primary challenges. First, the data are highly heterogeneous: both the first and second moments of the data-generating process appear to vary over space and across land-use categories. Second, the majority of the sampled locations do not have lab-measured values for soil organic carbon. Instead, visible and near-infrared (VNIR) spectra were measured at most locations, and these serve as a proxy to help predict carbon content. We therefore develop a heterogeneous model for these data that allows both the mean and the variance to vary as a function of space and land-use category, while incorporating VNIR spectra as covariates. After a cross-validation study that establishes the effectiveness of the model, we construct a complete map of soil organic carbon for the contiguous U.S., along with uncertainty quantification.
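To make the idea of a heterogeneous model concrete, the following is a minimal sketch, not the model developed in this work. It assumes a single scalar covariate standing in for the VNIR spectra, lets the regression coefficients and the residual variance differ by land-use category only, and leaves out spatial variation entirely. The category names, the simulated values, and the helper names `fit_by_group`, `predict`, and `simulate` are all hypothetical.

```python
import math
import random
from collections import defaultdict


def fit_by_group(records):
    """Fit y = a + b*x separately per group, with a per-group residual variance.

    records: iterable of (group, x, y) tuples.
    Returns {group: (intercept, slope, residual_variance)}.
    """
    grouped = defaultdict(list)
    for group, x, y in records:
        grouped[group].append((x, y))

    fits = {}
    for group, pts in grouped.items():
        n = len(pts)
        mean_x = sum(x for x, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pts)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        rss = sum((y - intercept - slope * x) ** 2 for x, y in pts)
        # Unbiased residual variance for a two-parameter linear fit.
        fits[group] = (intercept, slope, rss / (n - 2))
    return fits


def predict(fits, group, x, z=1.96):
    """Return the predicted mean and an approximate 95% interval.

    The interval uses only the group's residual variance and ignores
    uncertainty in the estimated coefficients.
    """
    intercept, slope, var = fits[group]
    mean = intercept + slope * x
    half_width = z * math.sqrt(var)
    return mean, (mean - half_width, mean + half_width)


def simulate(seed=0, n_per_group=200):
    """Simulate toy data with a group-specific mean function and noise level."""
    rng = random.Random(seed)
    # Hypothetical (intercept, slope, noise sd) per land-use category.
    truth = {"cropland": (1.0, 2.0, 0.2), "wetland": (5.0, 4.0, 1.5)}
    records = []
    for group, (a, b, sd) in truth.items():
        for _ in range(n_per_group):
            x = rng.uniform(0.0, 1.0)
            records.append((group, x, a + b * x + rng.gauss(0.0, sd)))
    return records


fits = fit_by_group(simulate())
```

Calling `predict(fits, "wetland", 0.5)` returns a wider interval than `predict(fits, "cropland", 0.5)` because the simulated wetland noise is larger. This is the kind of category-dependent uncertainty that a model with a single shared variance would miss.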