In this work, we present ThermoHands, a new benchmark for egocentric 3D hand pose estimation from thermal images, aimed at overcoming challenges such as varying lighting and obstructions (e.g., handwear). The benchmark includes a diverse dataset from 28 subjects performing hand-object and hand-virtual interactions, accurately annotated with 3D hand poses through an automated process. We introduce a bespoke baseline method, TheFormer, which uses dual transformer modules for effective egocentric 3D hand pose estimation in thermal imagery. Our experimental results highlight TheFormer's leading performance and affirm thermal imaging's effectiveness in enabling robust 3D hand pose estimation in adverse conditions.