Hyperspectral imaging (HSI) is a non-destructive spatial spectroscopy technique with many potential applications. A recurring challenge, however, is the small size of target datasets, which rules out an exhaustive architecture search. For new applications, practitioners therefore typically rely on established methods in the hope that they generalize well. This hope is often unfounded, because models tend to be finely tuned to a specific HSI context. To address this problem, this study introduces a benchmark dataset that covers three markedly different HSI applications: food inspection, remote sensing, and recycling. The dataset allows a finer-grained assessment of hyperspectral model capabilities and a closer examination of current state-of-the-art techniques, which should support the development of better methods. The greater diversity of the benchmark also makes it possible to establish a pretraining pipeline for HSI, and this pretraining improves the training stability of larger models. Finally, a procedure is outlined for handling applications with limited target dataset sizes.