RGB-D And Thermal Sensor Fusion: A Systematic Literature Review

Over the last decade, the computer vision field has seen significant progress in multimodal data fusion and learning, where multiple sensors, including depth, infrared, and visual, are used to capture the environment across diverse spectral ranges. Despite these advances, there has been no systematic and comprehensive evaluation of fusing RGB-D and thermal (RGB-DT) modalities to date. Autonomous driving with LiDAR, radar, RGB, and other sensors has attracted substantial research interest, as has the fusion of RGB and depth modalities; by contrast, the integration of thermal cameras, and specifically the fusion of RGB-D and thermal data, has received comparatively little attention. This may be partly due to the limited number of publicly available datasets for such applications. This paper provides a comprehensive review of both state-of-the-art and traditional methods for fusing RGB-D and thermal camera data in applications such as site inspection, human tracking, and fault detection. The reviewed literature is categorised by technical area, such as 3D reconstruction, segmentation, and object detection, and also covers available datasets and other related topics. Following a brief introduction and an overview of the methodology, the study covers calibration and registration techniques, then examines thermal visualisation and 3D reconstruction, before discussing the application of classic feature-based techniques as well as modern deep learning approaches. The paper concludes with a discussion of current limitations and potential future research directions. It is hoped that this survey will serve as a valuable reference for researchers looking to familiarise themselves with the latest advancements and contribute to the RGB-DT research field.
