Positional encoding (PE) based on Fourier features is commonly used in machine learning tasks that involve learning high-frequency features from low-dimensional inputs, such as 3D view synthesis and time series regression with neural tangent kernels. Despite their effectiveness, existing PEs require crucial hyperparameters, specifically the Fourier features, to be adjusted manually and empirically for each task. Further, PEs struggle to learn high-frequency functions efficiently, particularly in tasks with limited data. In this paper, we introduce sinusoidal PE (SPE), which is designed to efficiently learn adaptive frequency features closely aligned with the true underlying function. Our experiments demonstrate that SPE, without hyperparameter tuning, consistently achieves higher fidelity and faster training across various tasks, including 3D view synthesis, text-to-speech generation, and 1D regression. SPE is implemented as a direct replacement for existing PEs, and its plug-and-play nature lets numerous tasks easily adopt and benefit from it.
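For orientation, the kind of conventional, fixed-frequency Fourier-feature PE that the abstract contrasts SPE with can be sketched roughly as follows. This is not SPE and is not taken from the paper: the function name `fourier_pe`, the parameter `num_freqs`, and the power-of-two frequency schedule are illustrative assumptions. The sketch only shows why the frequencies act as hand-tuned hyperparameters.

```python
import numpy as np


def fourier_pe(x, num_freqs=4):
    """Hypothetical fixed-frequency Fourier-feature encoding (not SPE).

    x: array of shape (..., d) holding low-dimensional inputs.
    num_freqs: number of frequencies per input dimension; this is the
        kind of hyperparameter that is usually chosen by hand per task.
    Returns an array of shape (..., d * 2 * num_freqs).
    """
    x = np.asarray(x, dtype=np.float64)
    # Assumed schedule: pi, 2*pi, 4*pi, ... (fixed, not learned).
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    # Broadcast every input coordinate against every frequency.
    angles = x[..., None] * freqs  # shape (..., d, num_freqs)
    # For each coordinate: all sines first, then all cosines.
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    # Flatten the per-coordinate features into one vector per input.
    return feats.reshape(*x.shape[:-1], -1)


# Encode a single 3D point with four frequencies per coordinate.
points = np.array([[0.0, 0.5, 1.0]])
encoded = fourier_pe(points, num_freqs=4)
```

In this sketch, changing `num_freqs` or the frequency schedule changes which frequencies the downstream network can represent easily, which is the manual adjustment the abstract refers to.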