We introduce VIVE3D, a novel approach that extends the capabilities of image-based 3D GANs to video editing and represents the input video in an identity-preserving and temporally consistent way. We propose two new building blocks. First, we introduce a novel GAN inversion technique specifically tailored to 3D GANs that jointly embeds multiple frames and optimizes for the camera parameters. Second, in addition to traditional semantic face edits (e.g., age and expression), we are the first to demonstrate edits that show novel views of the head, enabled by the inherent properties of 3D GANs and by our optical flow-guided compositing technique, which combines the head with the background video. Our experiments demonstrate that VIVE3D generates high-fidelity face edits at consistent quality from a range of camera viewpoints, and that these edits are composited with the original video in a temporally and spatially consistent manner.
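To make the idea of jointly embedding multiple frames while optimizing camera parameters concrete, the following is a minimal toy sketch, not the VIVE3D implementation. It assumes a hypothetical linear "generator" in place of a real 3D GAN, and the names `render`, `joint_inversion`, `A`, and `B` are illustrative only. One latent vector is shared by all frames (standing in for identity), each frame gets its own camera vector, and both are fitted by plain gradient descent on a reconstruction loss.

```python
import numpy as np


def render(A, B, w, cam):
    """Hypothetical linear 'generator': image = A @ w + B @ cam (not a real 3D GAN)."""
    return A @ w + B @ cam


def joint_inversion(frames, A, B, steps=1000):
    """Fit one shared latent and one camera vector per frame by gradient descent."""
    n_frames = frames.shape[0]
    w = np.zeros(A.shape[1])                 # latent shared across all frames
    cams = np.zeros((n_frames, B.shape[1]))  # one camera vector per frame
    # Conservative step size from an upper bound on the Lipschitz constant.
    lipschitz = (np.sqrt(n_frames) * np.linalg.norm(A, 2) + np.linalg.norm(B, 2)) ** 2
    lr = 1.0 / lipschitz
    losses = []
    for _ in range(steps):
        # Residual of every frame under the shared latent and its own camera.
        residuals = (A @ w)[None, :] + cams @ B.T - frames
        losses.append(float(np.mean(residuals ** 2)))
        # Gradients of 0.5 * (sum of squared residuals) w.r.t. w and each camera.
        grad_w = A.T @ residuals.sum(axis=0)
        grad_cams = residuals @ B
        w -= lr * grad_w
        cams -= lr * grad_cams
    return w, cams, losses


# Synthetic demo: five "frames" of one identity seen from five cameras.
rng = np.random.default_rng(0)
A = rng.normal(size=(64, 8))
B = rng.normal(size=(64, 3))
w_true = rng.normal(size=8)
cams_true = rng.normal(size=(5, 3))
frames = np.stack([render(A, B, w_true, c) for c in cams_true])

w_hat, cams_hat, losses = joint_inversion(frames, A, B)
print(f"reconstruction loss: {losses[0]:.3f} -> {losses[-1]:.2e}")
```

In this toy setting the shared latent is constrained by every frame at once, while viewpoint changes are absorbed by the per-frame camera vectors. That mirrors the intuition behind joint multi-frame inversion. A real 3D GAN is nonlinear and would require an autodiff framework and perceptual losses, which this sketch deliberately omits.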