FaceChain: A Playground for Identity-Preserving Portrait Generation

Yang Liu, Cheng Yu, Lei Shang, Ziheng Wu, Xingjun Wang, Yuze Zhao, Lin Zhu, Chen Cheng, Weitao Chen, Chao Xu, Haoyu Xie, Yuan Yao, Wenmeng Zhou, Yingda Chen, Xuansong Xie, Baigui Sun

From arXiv. This is ongoing work that will be continuously refined and improved.

Recent advances in personalized image generation have revealed the intriguing capability of pre-trained text-to-image models to learn identity information from a collection of portrait images. However, existing solutions struggle to produce truthful details and usually suffer from several defects: (i) the generated face exhibits its own unique characteristics, i.e., its facial shape and facial feature positioning may not resemble the key characteristics of the input, and (ii) the synthesized face may contain warped, blurred, or corrupted regions. In this paper, we present FaceChain, a personalized portrait generation framework that combines a series of customized image-generation models with a rich set of face-related perceptual understanding models (e.g., face detection, deep face embedding extraction, and facial attribute recognition) to tackle the aforementioned challenges and generate truthful personalized portraits from only a handful of portrait images. Concretely, we inject several SOTA face models into the generation procedure, achieving more efficient label tagging, data processing, and model post-processing than previous solutions such as DreamBooth~\cite{ruiz2023dreambooth}, InstantBooth~\cite{shi2023instantbooth}, or other LoRA-only approaches~\cite{hu2021lora}. Through the development of FaceChain, we have identified several potential directions for accelerating the development of Face/Human-Centric AIGC research and applications. We have designed FaceChain as a framework composed of pluggable components that can be easily adjusted to accommodate different styles and personalized needs. We hope it can grow to serve the burgeoning needs of the community. FaceChain is open-sourced under the Apache-2.0 license at \url{https://github.com/modelscope/facechain}.
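As an illustration of how deep face embeddings might support the post-processing mentioned above, the following minimal sketch ranks generated candidates by cosine similarity to the mean embedding of the input portraits. This is an assumption made for illustration, not the method described in the paper: the function names (`cosine_similarity`, `mean_embedding`, `rank_by_identity`) are hypothetical, and the embeddings are taken to be plain lists of floats produced by some face embedding model outside this snippet.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_embedding(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of a non-empty list of equal-length embeddings."""
    count = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / count for i in range(dim)]


def rank_by_identity(
    reference_embeddings: Sequence[Sequence[float]],
    candidate_embeddings: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (candidate index, similarity) pairs, most similar first.

    Hypothetical helper: the reference identity is approximated by the mean
    of the input-portrait embeddings, which is an assumption of this sketch.
    """
    reference = mean_embedding(reference_embeddings)
    scored = [
        (index, cosine_similarity(reference, candidate))
        for index, candidate in enumerate(candidate_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real face embeddings.
    inputs = [[1.0, 0.0], [0.8, 0.2]]
    generated = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    for index, score in rank_by_identity(inputs, generated):
        print(index, round(score, 3))
```

In a real pipeline the toy vectors would be replaced by the output of a face recognition model, and the top-ranked candidates would be kept as the portraits that best preserve identity.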

