Multispectral (MS) snapshot cameras equipped with an MS filter array (MSFA) capture multiple spectral bands in a single shot, producing a raw mosaic image in which each pixel holds only one channel value. The fully-defined MS image is estimated from the raw one through $\textit{demosaicing}$, which inevitably introduces spatio-spectral artifacts. Moreover, training on fully-defined MS images can be computationally intensive, particularly with deep neural networks (DNNs), and may yield features that lack discriminative power because spatio-spectral interactions are learned suboptimally. Furthermore, outdoor MS images are acquired under varying lighting conditions, which makes the learned features illumination-dependent. This paper presents an original approach to learning discriminant and illumination-robust features directly from raw images. It involves three components: $\textit{raw spectral constancy}$ to mitigate the impact of illumination, $\textit{MSFA-preserving}$ transformations suited to raw image augmentation, so that DNNs are trained on diverse raw textures, and $\textit{raw-mixing}$ to capture discriminant spatio-spectral interactions in raw images. Experiments on MS image classification show that our approach outperforms both handcrafted and recent deep learning-based methods while requiring significantly less computational effort.
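As an illustration of the raw mosaic structure described above, the following minimal sketch samples a fully-defined MS cube through a periodic MSFA and applies one transformation that keeps the MSFA layout intact. The abstract does not specify which transformations the approach uses, so the circular shift by whole MSFA periods shown here is only an assumed example of an MSFA-preserving operation. The function names `mosaic` and `msfa_preserving_shift`, and the 2×2 pattern in the usage note, are hypothetical.

```python
import numpy as np


def mosaic(cube, pattern):
    """Sample a fully-defined MS cube (H, W, K) through a periodic MSFA.

    `pattern` is a (p, q) integer array giving the band index observed at
    each position of the basic MSFA tile (hypothetical layout).
    Returns a raw image (H, W) holding one channel value per pixel.
    """
    h, w, _ = cube.shape
    p, q = pattern.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    # Band index observed at every pixel, obtained by tiling the pattern.
    band = pattern[rows % p, cols % q]
    return np.take_along_axis(cube, band[..., None], axis=2)[..., 0]


def msfa_preserving_shift(raw, pattern_shape, dy, dx):
    """Circularly shift a raw image by whole MSFA periods.

    Assumed example of an MSFA-preserving transformation: because the shift
    is a multiple of the pattern size (and H, W are assumed to be multiples
    of it too), every pixel still holds a value of the band the MSFA
    assigns to that position.
    """
    p, q = pattern_shape
    return np.roll(raw, shift=(dy * p, dx * q), axis=(0, 1))
```

For instance, with `pattern = np.array([[0, 1], [2, 3]])` and a cube of shape `(4, 4, 4)`, `mosaic` returns a `(4, 4)` raw image, and shifting it with `msfa_preserving_shift(raw, pattern.shape, 1, 1)` leaves the band observed at each pixel position unchanged. A shift by a single pixel would not, which is why generic augmentations designed for fully-defined images cannot be applied to raw images as they are.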