The ability to learn compact, high-quality, and easy-to-optimize representations of visual data is paramount to many applications, such as novel view synthesis and 3D reconstruction. Recent work has shown substantial success in using tensor networks to design such compact and high-quality representations. However, the ability to optimize tensor-based representations, and in particular the highly compact tensor train representation, is still lacking. This has prevented practitioners from deploying the full potential of tensor networks for visual data. To this end, we propose Prolongation Upsampling Tensor Train (PuTT), a novel method for learning tensor train representations in a coarse-to-fine manner. Our method involves the prolongation, or 'upsampling', of a learned tensor train representation, creating a sequence of coarse-to-fine tensor trains that are incrementally refined. We evaluate our representation along three axes: (1) compression, (2) denoising capability, and (3) image completion capability. To assess these axes, we consider the tasks of image fitting, 3D fitting, and novel view synthesis, where our method shows improved performance compared to state-of-the-art tensor-based methods. For full results, see our project webpage: https://sebulo.github.io/PuTT_website/
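To make the coarse-to-fine idea concrete, the following is a minimal sketch of one prolongation step for a quantized tensor train (QTT) representing a length-2^n signal. The abstract does not specify the prolongation operator, so this sketch uses the simplest illustrative choice, nearest-neighbor upsampling, which in a big-endian QTT amounts to appending a rank-1 core of ones; the helper names (`tt_to_full`, `prolong_nearest`) are hypothetical, not the authors' API.

```python
import numpy as np

def tt_to_full(cores):
    """Contract a QTT (list of cores of shape (r_prev, 2, r_next))
    into the full length-2^n vector it represents."""
    res = cores[0]  # shape (1, 2, r1)
    for core in cores[1:]:
        # Merge the mode of the next core: (1, m, r) x (r, 2, s) -> (1, 2m, s)
        res = np.einsum('amr,rbs->ambs', res, core)
        res = res.reshape(1, -1, core.shape[2])
    return res.reshape(-1)

def prolong_nearest(cores):
    """One coarse-to-fine step: append a core of ones of shape (1, 2, 1).
    With big-endian bit ordering this doubles the signal length by
    nearest-neighbor upsampling: v_fine[2j] = v_fine[2j + 1] = v_coarse[j].
    The prolonged train can then be refined further by optimization."""
    return cores + [np.ones((cores[-1].shape[2], 2, 1))]
```

After a prolongation step, the fine-scale train reproduces the coarse signal exactly at doubled resolution, so subsequent optimization starts from the coarse solution rather than from scratch; an interpolation-based prolongation would replace the core of ones with interpolation weights.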