Inverse rendering is an ill-posed problem, but priors, such as illumination priors, can simplify it. Existing work either disregards the spherical, rotation-equivariant nature of illumination environments or fails to provide a well-behaved latent space. We propose a rotation-equivariant variational autoencoder that models natural illumination directly on the sphere, without relying on 2D projections. To preserve the SO(2)-equivariance of environment maps, we use a novel Vector Neuron Vision Transformer (VN-ViT) as the encoder and a rotation-equivariant conditional neural field as the decoder. In the encoder, we reduce the equivariance from SO(3) to SO(2) using a novel SO(2)-equivariant fully connected layer, an extension of Vector Neurons. We show that this SO(2)-equivariant fully connected layer outperforms standard Vector Neurons when used in our SO(2)-equivariant model. Compared to previous methods, our variational autoencoder yields a better-behaved latent space and enables smoother interpolation within it.
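The abstract's key building block, an SO(2)-equivariant fully connected layer in the Vector Neuron style, can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes one natural construction: since rotations about the z-axis leave the z-component of each vector feature invariant, the (x, y) components and the z-component can be mixed across channels with independent weight matrices (here the hypothetical names `W_xy` and `W_z`) while channel-mixing still commutes with in-plane rotation.

```python
import numpy as np

def so2_linear(V, W_xy, W_z):
    """Sketch of an SO(2)-equivariant linear layer on vector features.

    V: (C, 3) array of C vector features; rotations about z act on the last axis.
    Restricting from SO(3) to SO(2) lets the rotation-invariant z-component be
    mixed with weights (W_z) independent of the in-plane weights (W_xy),
    which a plain SO(3) Vector Neuron layer cannot do.
    """
    out = np.empty((W_xy.shape[0], 3))
    out[:, :2] = W_xy @ V[:, :2]  # channel mixing commutes with in-plane rotation
    out[:, 2] = W_z @ V[:, 2]     # z-component is SO(2)-invariant: mix freely
    return out

def rot_z(theta):
    """Rotation matrix about the z-axis by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Equivariance check: rotating the input equals rotating the output.
rng = np.random.default_rng(0)
V = rng.normal(size=(8, 3))
W_xy = rng.normal(size=(16, 8))
W_z = rng.normal(size=(16, 8))
R = rot_z(0.7)
lhs = so2_linear(V @ R.T, W_xy, W_z)
rhs = so2_linear(V, W_xy, W_z) @ R.T
assert np.allclose(lhs, rhs)
```

The numerical check at the end verifies the defining equivariance property: applying a z-axis rotation before the layer gives the same result as applying it after.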