3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather. As a complement, mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather. However, combining RGB and mmWave signals for robust all-weather 3D human reconstruction remains an open challenge, given the sparsity of mmWave signals and the vulnerability of RGB images to adverse weather. In this paper, we present ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D human bodies in all weather conditions. Specifically, ImmFusion consists of image and point backbones for token feature extraction and a Transformer module for token fusion. The image and point backbones extract global and local features from the raw data, and the Fusion Transformer Module fuses the two modalities by dynamically selecting informative tokens. Extensive experiments on mmBody, a large-scale dataset captured in various environments, demonstrate that ImmFusion efficiently exploits the information of both modalities to achieve robust 3D human body reconstruction in all weather conditions. In addition, our method is significantly more accurate than state-of-the-art Transformer-based LiDAR-camera fusion methods.
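The token-fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Fusion Transformer Module: it assumes a single attention head with no learned projections, and the function name `fuse_tokens` and the toy token values are hypothetical. It only shows how tokens from two backbones can be concatenated into one sequence so that attention weights softly select the more informative tokens.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_tokens(image_tokens, point_tokens):
    """Toy single-head self-attention over the concatenated token sequence.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), which a trained model would replace with
    learned linear layers.
    """
    tokens = image_tokens + point_tokens  # one joint sequence for both modalities
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused, weights = [], []
    for query in tokens:
        # Attention weights act as a soft selection over all tokens.
        row = softmax([dot(query, key) * scale for key in tokens])
        weights.append(row)
        fused.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return fused, weights


# Hypothetical 2-D tokens: two from the image backbone, one from the point backbone.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
point_tokens = [[1.0, 1.0]]
fused, weights = fuse_tokens(image_tokens, point_tokens)
```

Each row of `weights` sums to 1, so every fused token is a convex combination of image and point tokens. In a setting like the one the abstract describes, a degraded modality would ideally receive lower weights, although how ImmFusion achieves this is not specified here.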