SoilNet: A Multimodal Multitask Model for Hierarchical Classification of Soil Horizons

Recent advances in artificial intelligence (AI), in particular foundation models, have improved the state of the art in many application domains, including the geosciences. Some specific problems, however, have not yet benefited from this progress. Soil horizon classification, for instance, remains challenging because it is both multimodal and multitask and because its label taxonomy is complex and hierarchically structured. Accurate classification of soil horizons is crucial for monitoring soil condition. In this work, we propose SoilNet, a multimodal multitask model that tackles this problem through a structured, modularized pipeline. In contrast to general-purpose AI foundation models, our approach is designed to be inherently transparent: it follows the task structure that human experts developed for this challenging annotation task. The proposed approach integrates image data and geotemporal metadata to first predict depth markers, which segment the soil profile into horizon candidates. Each segment is then characterized by a set of horizon-specific morphological features. Finally, horizon labels are predicted from the concatenated multimodal feature vector, using a graph-based label representation to account for the complex hierarchical relationships among soil horizons. Our method is designed for complex hierarchical classification, where the set of possible labels is very large, imbalanced, and non-trivially structured. We demonstrate the effectiveness of our approach on a real-world soil profile dataset and in a comprehensive user study with domain experts. Our empirical evaluations show that SoilNet reliably predicts soil horizons that are plausible and accurate. The user study results indicate that SoilNet achieves predictive performance on par with or better than that of human experts. All code can be found at: https://github.com/calgo-lab/BGR/
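The three pipeline stages described above (depth markers, per-segment features, label prediction against a label representation) can be illustrated with a minimal sketch. This is not the SoilNet implementation. The names `Segment`, `split_profile`, `concat_features`, and `nearest_label` are hypothetical, the feature values are toy numbers, and the two-dimensional label embeddings for the horizon symbols "Ah" and "Bv" merely stand in for whatever graph-based label representation the model actually learns. Cosine similarity is likewise an assumption made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Segment:
    """One horizon candidate, bounded by two depths in centimetres (hypothetical type)."""
    top_cm: float
    bottom_cm: float


def split_profile(depth_markers_cm: Sequence[float], profile_depth_cm: float) -> List[Segment]:
    """Turn predicted boundary depths into contiguous horizon candidates."""
    # Keep only markers strictly inside the profile, sorted and de-duplicated.
    inner = sorted({d for d in depth_markers_cm if 0 < d < profile_depth_cm})
    bounds = [0.0, *inner, profile_depth_cm]
    return [Segment(top, bottom) for top, bottom in zip(bounds, bounds[1:])]


def concat_features(*parts: Sequence[float]) -> List[float]:
    """Concatenate per-modality feature vectors into one flat vector."""
    return [value for part in parts for value in part]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_label(vector: Sequence[float], label_embeddings: Dict[str, Sequence[float]]) -> str:
    """Pick the label whose (assumed) embedding is most similar to the vector."""
    return max(label_embeddings, key=lambda name: cosine(vector, label_embeddings[name]))


# Toy usage: two predicted markers split a 100 cm profile into three candidates.
segments = split_profile([30, 75], profile_depth_cm=100)
# Toy one-dimensional "image" and "metadata/morphology" features for one segment.
fused = concat_features([0.9], [0.2])
# Illustrative label embeddings, not taken from the paper.
label = nearest_label(fused, {"Ah": [1.0, 0.0], "Bv": [0.0, 1.0]})
```

In a real system the fused vector would typically pass through a learned projection before being compared with label embeddings; the sketch skips that step so that the data flow between stages stays visible.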
