Simulating soil reflectance spectra is invaluable for soil-plant radiative modeling and for training machine learning models, yet it is difficult because of the intricate relationships between soil structure and soil constituents. To address this, a fully data-driven soil optics generative model (SOGM) was developed to simulate soil reflectance spectra from soil property inputs. The model is trained on an extensive dataset comprising nearly 180,000 soil spectra-property pairs from 17 datasets. It generates soil reflectance spectra from text-based inputs that describe soil properties and their values, rather than only from numerical values and labels in binary vector format. The generative model can simulate output spectra from an incomplete set of input properties. SOGM is based on the denoising diffusion probabilistic model (DDPM). Two sub-models were also built to complement the SOGM: a spectral padding model that fills in the gaps for spectra covering less than the full visible-near-infrared range (VIS-NIR; 400 to 2499 nm), and a wet soil spectra model that estimates the effects of water content on soil reflectance spectra given the dry spectrum predicted by the SOGM. The SOGM was up-scaled by coupling it with the Helios 3D plant modeling software, which allowed the generation of synthetic aerial images of simulated soil and plant scenes. It can also be easily integrated with soil-plant radiation models used in remote sensing research, such as PROSAIL. Tests of the SOGM on new datasets that were not included in model training showed that the model can generate reasonable soil reflectance spectra based on the available property inputs. The presented models are openly accessible at https://github.com/GEMINI-Breeding/SOGM_soil_spectra_simulation.
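To make two of the ideas above concrete (text-based conditioning on an incomplete property set, and the DDPM noising process), the following is a minimal sketch and not the SOGM implementation. The prompt wording, the property names, the 1 nm sampling grid, the toy spectrum, and the helper names `properties_to_prompt`, `linear_beta_schedule`, and `forward_noise` are all assumptions made for illustration. The linear noise schedule (1e-4 to 0.02 over 1000 steps) is the one used in the original DDPM paper, and the abstract does not say whether SOGM uses it.

```python
import math
import random


def properties_to_prompt(properties):
    """Build a text prompt from whichever soil properties are available.

    Properties whose value is None are skipped, which mimics conditioning
    on an incomplete property set. The prompt format is hypothetical.
    """
    parts = [f"{name}: {value}" for name, value in properties.items()
             if value is not None]
    return "; ".join(parts)


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (the schedule of the original DDPM paper)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_noise(spectrum, t, betas, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    alpha_bar = 1.0
    for beta in betas[: t + 1]:
        alpha_bar *= 1.0 - beta  # cumulative product of (1 - beta_s)
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0)
            for x in spectrum]


# Toy spectrum on an assumed 1 nm grid from 400 to 2499 nm (2100 bands).
wavelengths = list(range(400, 2500))
spectrum = [0.1 + 0.3 * (w - 400) / 2099 for w in wavelengths]

# Organic carbon is unknown here, so it is simply left out of the prompt.
prompt = properties_to_prompt(
    {"clay (%)": 22.5, "organic carbon (%)": None, "pH": 6.8}
)

betas = linear_beta_schedule(1000)
noisy = forward_noise(spectrum, t=999, betas=betas, rng=random.Random(0))
```

In a DDPM, a neural network is trained to reverse this noising process step by step. In a text-conditioned setup, an embedding of a prompt like the one above would be passed to that network as the conditioning signal, which is what would allow a spectrum to be generated from only the properties that are known.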