Efficiently capturing the long-range patterns in sequential data sources salient to a given task -- such as classification and generative modeling -- poses a fundamental challenge. Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural networks, or the parameter burden of convolutional networks with many or large filters. We instead take inspiration from wavelet-based multiresolution analysis to define a new building block for sequence modeling, which we call a MultiresLayer. The key component of our model is the multiresolution convolution (MultiresConv), which captures multiscale trends in the input sequence. MultiresConv can be implemented with shared filters across a dilated causal convolution tree. It thus combines the computational advantages of convolutional networks with the principled theoretical motivation of wavelet decompositions. Our MultiresLayer is straightforward to implement, requires significantly fewer parameters, and maintains at most an $\mathcal{O}(N\log N)$ memory footprint for a length-$N$ sequence. Yet, by stacking such layers, our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks on the CIFAR-10, ListOps, and PTB-XL datasets.
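To make the idea of shared filters across a dilated causal convolution tree concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes a wavelet-style recursion in which one low-pass filter `h0` and one high-pass filter `h1` are reused at every level, with the dilation doubling from level to level. The function names, the Haar filter values, and the choice of which signals to keep are all illustrative assumptions not given in the abstract.

```python
import math


def dilated_causal_conv(x, h, dilation):
    """Compute y[t] = sum_k h[k] * x[t - k * dilation], with x[t] = 0 for t < 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            idx = t - k * dilation
            if idx >= 0:  # causal: only current and past samples contribute
                acc += hk * x[idx]
        y.append(acc)
    return y


def multires_conv(x, h0, h1, depth):
    """Hypothetical multiresolution convolution.

    Returns [b_1, ..., b_depth, a_depth]: one detail signal per level plus
    the final coarse approximation. The same two filters are shared across
    all levels, so the parameter count does not grow with depth.
    """
    a = list(x)
    outputs = []
    for level in range(depth):
        d = 2 ** level  # dilation doubles at each level of the tree
        outputs.append(dilated_causal_conv(a, h1, d))  # detail (high-pass)
        a = dilated_causal_conv(a, h0, d)              # approximation (low-pass)
    outputs.append(a)
    return outputs


# Assumed Haar-like filters, chosen only for illustration.
h0 = [0.5, 0.5]
h1 = [0.5, -0.5]

x = [float(v) for v in range(1, 9)]   # toy sequence of length N = 8
depth = int(math.log2(len(x)))        # about log2(N) levels
outputs = multires_conv(x, h0, h1, depth)

# Summing all retained signals with unit weights recovers the input
# for this particular pair of filters.
reconstructed = [sum(col) for col in zip(*outputs)]
```

With about $\log_2 N$ levels, the sketch stores `depth + 1` signals of length $N$, which is in line with the $\mathcal{O}(N\log N)$ memory bound stated above, while the number of filter parameters stays at `len(h0) + len(h1)` regardless of depth. In a learned layer the unit weights used for `reconstructed` would presumably be replaced by trainable mixing weights; that step is an assumption here.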