The use of social media, and of photo and video sharing platforms in particular, has risen markedly in recent years. User interaction on these sites yields rich data sets that can support a data-driven evaluation of capabilities. Nevertheless, this study reveals a lack of suitable data sets for photo and video sharing platforms and of evaluation processes across them. Our first contribution is therefore one of the largest labelled multimodal data sets from Flickr, which we have open sourced as part of this work. Using these data, we explored machine learning models and concluded that it is feasible to predict whether a user is a professional photographer, based on self-reported occupation labels and several feature representations drawn from the user, photo and crowdsourced sets. We also examined the relationship between the aesthetic and technical quality of a picture and the social activity around that picture. Finally, we identified the characteristics that differentiate professional photographers from non-professionals. To the best of our knowledge, these results are novel for user expertise identification, and researchers from various domains can use them for different applications.