WSD: Wild Selfie Dataset for Face Recognition in Selfie Images

With the rise of handy smartphones in recent years, capturing selfie images has become a widespread trend, so efficient approaches are needed for recognizing faces in selfie images. The short distance between the camera and the face, together with the different visual effects offered by selfie apps, makes face recognition more challenging for existing approaches. To facilitate research on selfie face images, we develop a challenging Wild Selfie Dataset (WSD) whose images are captured with the selfie cameras of different smartphones, unlike existing datasets, where most images are captured in controlled environments. WSD contains 45,424 images of 42 individuals (24 female and 18 male subjects), divided into 40,862 training and 4,562 test images. The average number of images per subject is 1,082, with a minimum of 518 and a maximum of 2,634 images for any subject. The dataset poses several challenges, including but not limited to augmented reality filtering, mirrored images, occlusion, illumination, scale, expressions, viewpoint, aspect ratio, blur, partial faces, rotation, and alignment. We compare WSD with existing benchmark datasets in terms of different characteristics. The complexity of WSD is also observed experimentally: existing state-of-the-art face recognition methods perform poorly on WSD compared with existing datasets. Hence, WSD opens up new challenges in face recognition and can help the community study the specific challenges of selfie images and develop improved face recognition methods for them.
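The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the reported numbers; the constant and function names are illustrative and do not come from the paper or from any released WSD tooling.

```python
# Figures as quoted in the abstract; all names here are hypothetical.
TOTAL_IMAGES = 45_424
TRAIN_IMAGES = 40_862
TEST_IMAGES = 4_562
NUM_SUBJECTS = 42
FEMALE_SUBJECTS = 24
MALE_SUBJECTS = 18


def summarize_wsd_counts():
    """Check that the reported counts agree and derive two ratios."""
    # The training and test splits should add up to the full dataset.
    assert TRAIN_IMAGES + TEST_IMAGES == TOTAL_IMAGES
    # The female and male subject counts should add up to all subjects.
    assert FEMALE_SUBJECTS + MALE_SUBJECTS == NUM_SUBJECTS
    return {
        "mean_images_per_subject": TOTAL_IMAGES / NUM_SUBJECTS,
        "test_fraction": TEST_IMAGES / TOTAL_IMAGES,
    }


if __name__ == "__main__":
    for name, value in summarize_wsd_counts().items():
        print(f"{name}: {value:.4f}")
```

The mean rounds to the 1,082 images per subject stated in the abstract, and the test split works out to roughly a tenth of the data.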

