Portrait images typically consist of a salient person against diverse backgrounds. With the development of mobile devices and image processing techniques, users can conveniently capture portrait images anytime and anywhere. However, the quality of these portraits may suffer from degradation caused by unfavorable environmental conditions, subpar photography techniques, and inferior capture devices. In this paper, we introduce a dual-branch network for portrait image quality assessment (PIQA) that effectively models how the salient person and the background of a portrait image influence its visual quality. Specifically, we utilize two backbone networks (\textit{i.e.,} Swin Transformer-B) to extract quality-aware features from the entire portrait image and from the facial region cropped from it. To enhance the quality-aware feature representations of the backbones, we pre-train them on the large-scale video quality assessment dataset LSVQ and the large-scale facial image quality assessment dataset GFIQA. Additionally, we leverage LIQE, an image scene classification and quality assessment model, to capture quality-aware and scene-specific features as auxiliary features. Finally, we concatenate these features and regress them into quality scores via a multi-layer perceptron (MLP). To mitigate inconsistencies in the quality scores of the portrait image quality assessment dataset PIQ, we train the model in a learning-to-rank manner with the fidelity loss. Experimental results demonstrate that the proposed model achieves superior performance on the PIQ dataset, validating its effectiveness. The code is available at \url{https://github.com/sunwei925/DN-PIQA.git}.
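The pipeline above (concatenating the branch features, regressing a score with an MLP head, and ranking pairs with the fidelity loss) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature dimensions, the two-layer MLP head, and the Thurstone-style (unit-variance Gaussian) preference model used to turn score differences into pairwise probabilities are all assumptions for the sketch.

```python
import math

import numpy as np

def predict_score(full_feat, face_feat, liqe_feat, weights):
    """Concatenate the three branch features and regress a quality score
    with a two-layer MLP head (hypothetical weights; shapes illustrative)."""
    x = np.concatenate([full_feat, face_feat, liqe_feat])
    w1, b1, w2, b2 = weights
    h = np.maximum(x @ w1 + b1, 0.0)   # hidden layer with ReLU
    return float(h @ w2 + b2)          # scalar quality score

def predicted_preference(si, sj):
    """Probability that image i is preferred over image j, derived from the
    predicted scores under a Thurstone-style unit-variance Gaussian model:
    p_hat = Phi((si - sj) / sqrt(2))."""
    return 0.5 * (1.0 + math.erf((si - sj) / 2.0))

def fidelity_loss(p, p_hat, eps=1e-8):
    """Fidelity loss between the ground-truth preference p and the predicted
    preference p_hat; it approaches zero when the two agree."""
    return (1.0 - math.sqrt(p * p_hat + eps)
                - math.sqrt((1.0 - p) * (1.0 - p_hat) + eps))
```

Training in a learning-to-rank manner then means sampling image pairs, converting their annotated scores into a target preference `p`, and minimizing `fidelity_loss(p, predicted_preference(si, sj))` over the pairs, which sidesteps absolute-score inconsistencies across the dataset.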