Avatar creation from human images lets users customize their digital figures in different styles. Existing rendering systems such as Bitmoji, MetaHuman, and Google Cartoonset are expressive and serve as excellent design tools for users. However, more than twenty parameters, some with hundreds of options, must be tuned to achieve ideal results, so it is challenging for users to create the perfect avatar. A machine learning model could be trained to predict avatars from images; however, the annotators who label the paired training data face the same difficulty as users, which causes high label noise. In addition, each new rendering system or version update requires thousands of new training pairs. In this paper, we propose a tag-based annotation method for avatar creation. Compared to direct annotation of labels, the proposed method produces higher annotator agreement, leads machine learning models to generate more consistent predictions, and requires only a marginal cost to add new rendering systems.