Theoretical and Practical Progress in Hyperspectral Pixel Unmixing with Large Spectral Libraries from a Sparse Perspective

Hyperspectral unmixing is the process of determining the presence of individual materials and their respective abundances from an observed pixel spectrum. Unmixing is a fundamental process in hyperspectral image analysis, and is growing in importance as increasingly large spectral libraries are created and used. Unmixing is typically done with ordinary least squares (OLS) regression. However, when unmixing with large spectral libraries in which the materials present in a pixel are not known a priori, solving for the OLS coefficients requires inverting a matrix that is not invertible. A number of regression methods can produce a numerical solution using regularization, but their effectiveness varies considerably. In addition, simple methods that are unpopular in the statistics literature (e.g., step-wise regression) are used with some level of effectiveness in hyperspectral analysis. In this paper, we provide a thorough performance evaluation of these methods, based on how often they select the correct materials in the models. The methods investigated are ordinary least squares regression, non-negative least squares regression, ridge regression, lasso regression, step-wise regression and Bayesian model averaging. We evaluate these unmixing approaches using multiple criteria: incorporation of non-negative abundances, model size, accurate mineral detection and root mean squared error (RMSE). We provide a taxonomy of the regression methods, showing that most methods can be understood as Bayesian methods with specific priors. We conclude that methods derived from priors that correspond to the phenomenology of hyperspectral imagery outperform those derived from priors that are optimal for prediction performance under the assumptions of ordinary least squares linear regression.
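The invertibility problem described above can be illustrated with a small sketch. It is not the paper's experimental setup: the library is random synthetic data, and the sizes, noise level, ridge penalty `lam` and all function names (`simulate_pixel`, `unmix_ridge`, `unmix_nnls`, `model_size`) are assumptions chosen for illustration. It shows that when the library has more spectra than bands, the OLS normal-equations matrix is rank deficient, and it compares two of the listed alternatives (ridge regression in closed form, and non-negative least squares via `scipy.optimize.nnls`) on model size and RMSE.

```python
import numpy as np
from scipy.optimize import nnls


def simulate_pixel(n_bands=50, n_library=200, n_present=3, noise_sd=0.001, seed=0):
    """Build a synthetic library (bands x materials) and one mixed pixel.

    All values are random stand-ins for real spectra (assumption).
    """
    rng = np.random.default_rng(seed)
    library = rng.uniform(0.05, 0.95, size=(n_bands, n_library))
    present = rng.choice(n_library, size=n_present, replace=False)
    abundances = np.zeros(n_library)
    # Abundances of the materials present are non-negative and sum to one.
    abundances[present] = rng.dirichlet(np.ones(n_present))
    pixel = library @ abundances + rng.normal(0.0, noise_sd, size=n_bands)
    return library, abundances, pixel


def unmix_ridge(library, pixel, lam=1e-2):
    """Ridge regression: solve (X^T X + lam * I) b = X^T y."""
    n_library = library.shape[1]
    gram = library.T @ library
    return np.linalg.solve(gram + lam * np.eye(n_library), library.T @ pixel)


def unmix_nnls(library, pixel):
    """Non-negative least squares: minimize ||X b - y|| subject to b >= 0."""
    coef, _ = nnls(library, pixel)
    return coef


def model_size(coef, tol=1e-8):
    """Number of library spectra with a non-negligible coefficient."""
    return int(np.sum(np.abs(coef) > tol))


def rmse(library, coef, pixel):
    """Root mean squared error of the reconstructed pixel spectrum."""
    return float(np.sqrt(np.mean((library @ coef - pixel) ** 2)))


library, truth, pixel = simulate_pixel()

# With more library spectra than bands, X^T X cannot have full rank,
# so the plain OLS normal equations have no unique solution.
gram_rank = np.linalg.matrix_rank(library.T @ library)
print("library spectra:", library.shape[1], "rank of X^T X:", gram_rank)

ridge_coef = unmix_ridge(library, pixel)
nnls_coef = unmix_nnls(library, pixel)

for name, coef in [("ridge", ridge_coef), ("nnls", nnls_coef)]:
    print(name, "model size:", model_size(coef),
          "min coef:", float(coef.min()),
          "RMSE:", rmse(library, coef, pixel))
```

On data like this, a low RMSE alone says little about whether the correct materials were selected, which is why model size and the sign of the coefficients are reported alongside it.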
