A multi-layer perceptron (MLP) is a type of neural network with a long history of research, and it has recently been studied actively in computer vision and graphics. A well-known limitation of MLPs is their difficulty in representing high-frequency signals from low-dimensional inputs. Several studies have proposed input encodings that improve the reconstruction quality of an MLP by pre-processing the input data. This paper proposes a novel input encoding method, local positional encoding, which extends positional encoding and grid encoding. The proposed method combines the two techniques: it applies positional encoding with fewer frequencies on a lower-resolution grid, so that the encoding accounts for the local position and scale within each grid cell and a small MLP can learn high-frequency signals. We demonstrate the effectiveness of the method on common 2D and 3D regression tasks, where it produces higher-quality results than positional encoding and grid encoding, and results comparable to hierarchical variants of grid encoding, such as multi-resolution grid encoding, at an equivalent memory footprint.
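To make the idea of encoding the local position within each grid cell concrete, the following is a minimal NumPy sketch, not the paper's implementation. The abstract does not specify how the grid and the positional encoding are combined, so the sketch assumes one plausible reading: each cell of a low-resolution grid stores a vector of coefficients that scales the sinusoids of the cell-local coordinate. The function name `local_positional_encoding`, the `coeffs` layout, and the frequency schedule `2π · 2^k` are all hypothetical choices made for illustration.

```python
import numpy as np


def local_positional_encoding(x, coeffs, num_freqs):
    """Sketch of a grid-modulated positional encoding (assumed formulation).

    x:         (N, D) array of inputs in [0, 1).
    coeffs:    array of shape (R,) * D + (D * 2 * num_freqs,), one coefficient
               vector per grid cell (learnable in a real training setup).
    num_freqs: number of sinusoid frequencies per input dimension.
    Returns an (N, D * 2 * num_freqs) feature array to feed into an MLP.
    """
    x = np.asarray(x, dtype=np.float64)
    n, _ = x.shape
    res = coeffs.shape[0]

    # Locate the grid cell of each point and its local coordinate in [0, 1).
    scaled = x * res
    cell = np.clip(np.floor(scaled).astype(int), 0, res - 1)
    local = scaled - cell

    # Positional encoding of the local coordinate (assumed schedule 2*pi*2^k).
    freqs = 2.0 * np.pi * 2.0 ** np.arange(num_freqs)       # (F,)
    angles = local[:, :, None] * freqs[None, None, :]       # (N, D, F)
    pe = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    pe = pe.reshape(n, -1)                                   # (N, D * 2F)

    # Look up each point's cell coefficients and use them as amplitudes.
    cell_coeffs = coeffs[tuple(cell.T)]                      # (N, D * 2F)
    return cell_coeffs * pe
```

Under these assumptions, a 4 × 4 grid with three frequencies in 2D yields 12 features per point, for example `local_positional_encoding(points, coeffs, 3)` with `coeffs` of shape `(4, 4, 12)`. Because the sinusoids are evaluated on the cell-local coordinate, a given frequency spans a region `R` times smaller than it would over the whole domain, which is one way to read the claim that fewer frequencies and a coarser grid can suffice.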