This thesis investigates the application of near-infrared hyperspectral imaging (NIR-HSI) to food quality analysis. The investigation comprises four studies that address five research hypotheses. For several analyses, the studies compare models based on convolutional neural networks (CNNs) with models based on partial least squares (PLS). In general, joint spatio-spectral analysis with CNNs outperforms both spatial analysis with CNNs and spectral analysis with PLS when modeling parameters for which both chemical and physical visual information are relevant. When modeling chemical parameters with a 2-dimensional (2D) CNN, augmenting the CNN with an initial layer dedicated to spectral convolution improves its predictive performance, because the layer learns a spectral preprocessing similar to that applied by domain experts. Still, PLS-based spectral modeling performs equally well for predicting the mean content of chemical parameters in samples and is the recommended approach for that task. Modeling the spatial distribution of chemical parameters with NIR-HSI is limited by the difficulty of obtaining spatially resolved reference values. One study therefore used bulk mean reference values to generate chemical maps of fat content in pork bellies. A PLS-based approach produced non-smooth chemical maps and pixel-wise predictions outside the range of 0-100\%. In contrast, a 2D CNN augmented with a spectral convolution layer mitigated all of these issues. The final study attempted to model the germinative capacity of barley by analyzing NIR spectra, RGB images, and NIR-HSI images. However, the results were inconclusive due to the dataset's low degree of germination. Additionally, this thesis has led to the development of two open-source Python packages. The first facilitates fast PLS-based modeling, while the second uses a new algorithm to enable very fast cross-validation of PLS and other classical machine learning models.
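To make the idea of a spectral convolution layer concrete, the following is a minimal NumPy sketch of filtering along the spectral axis of a hyperspectral cube. It is not the thesis's implementation: the function name `spectral_convolution`, the array shapes, and the synthetic data are assumptions made for illustration. In a CNN the filter weights would be learned; here they are fixed to a central-difference kernel, on the assumption that a first derivative is representative of the spectral preprocessing a domain expert might apply.

```python
import numpy as np


def spectral_convolution(cube: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply one 1D filter along the spectral axis of every pixel.

    cube:   hyperspectral image of shape (height, width, bands)
    kernel: filter weights of shape (k,)
    Returns an array of shape (height, width, bands - k + 1).
    """
    # Sliding windows over the last (spectral) axis:
    # shape (height, width, bands - k + 1, k)
    windows = np.lib.stride_tricks.sliding_window_view(cube, kernel.size, axis=-1)
    # Weighted sum over each window (cross-correlation, as in CNN layers)
    return windows @ kernel


# Synthetic cube (hypothetical data): each pixel has a linear spectrum
# with its own baseline offset and slope.
rng = np.random.default_rng(0)
height, width, bands = 4, 5, 20
slopes = rng.uniform(-1.0, 1.0, size=(height, width, 1))
offsets = rng.uniform(0.0, 10.0, size=(height, width, 1))
cube = offsets + slopes * np.arange(bands)

# Fixed central-difference kernel standing in for learned weights.
derivative_kernel = np.array([-0.5, 0.0, 0.5])
filtered = spectral_convolution(cube, derivative_kernel)
```

With this kernel, the filtered cube contains each pixel's spectral slope and no longer depends on the baseline offset, which is the kind of effect derivative preprocessing is used for. Because the operation acts on each pixel independently, its output keeps the spatial layout and could be passed to subsequent 2D convolution layers.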