Coordinate networks are widely used in computer vision because they represent signals as compressed, continuous entities. However, training these networks with first-order optimizers can be slow, which hinders their use in real-time applications. Recent works have opted for shallow voxel-based representations to achieve faster training, but at the cost of memory efficiency. This work proposes a solution that leverages second-order optimization methods to significantly reduce training times for coordinate networks while maintaining their compressibility. Experiments demonstrate the effectiveness of this approach across various signal modalities and tasks, such as audio, images, videos, shape reconstruction, and neural radiance fields.