Attention to the characteristics of food products, such as nutritional properties and traceability, has risen substantially in recent years. Consequently, there is an increasing demand for modern tools to monitor, analyse and assess food quality and authenticity. Within this framework, vibrational spectroscopy provides an essential set of data collection techniques. In fact, methods such as Fourier transform near-infrared and mid-infrared (MIR) spectroscopy have often been used to analyse different foodstuffs. Nonetheless, existing statistical methods often struggle with the challenges posed by spectral data, such as their high dimensionality paired with strong relationships among the wavelengths. Therefore, defining statistical procedures that account for the peculiarities of spectroscopy data is paramount. In this work, motivated by two dairy science applications, we propose an adaptive functional regression framework for spectroscopy data. The method stems from the trend filtering literature, which allows the definition of a highly flexible and adaptive estimator able to handle different degrees of smoothness. We provide a fast optimization procedure that is suitable for both Gaussian and non-Gaussian scalar responses and allows for the inclusion of scalar covariates. Moreover, we develop inferential procedures for both the functional and the scalar components, thus enhancing not only the interpretability of the results but also their usability in real-world scenarios. The method is applied to two sets of MIR spectroscopy data, providing excellent results when predicting milk chemical composition and cows' dietary treatments. Moreover, the developed inferential routine provides relevant insights, potentially paving the way for a richer interpretation and a better understanding of the impact of specific wavelengths on milk features.