In this paper, we propose a multi-task representation learning framework that jointly estimates the identity, gender and age of individuals from their hand images for use in criminal investigations, since hand images are often the only available evidence in cases of serious crime such as sexual abuse. We investigate recent deep learning architectures and compare their performance on this joint estimation task. To mitigate data imbalance and simplify age prediction, we define age groups for age estimation. We extensively evaluate and compare both convolution-based and transformer-based architectures on the publicly available 11k Hands dataset. Our experimental analysis shows that identity, gender and age can be estimated jointly and efficiently from hand images, which is crucial for assisting international police forces in identifying abusers and securing convictions in court.