Modeling long-range dependencies in sequential data is a crucial step in sequence learning. The recently developed Structured State Space (S4) model has proven highly effective at modeling long-range sequences. However, it is unclear whether the success of S4 stems from its intricate parameterization and HiPPO initialization, or simply from the use of State Space Models (SSMs). To further investigate the potential of deep SSMs, we start with exponential smoothing (ETS), a simple SSM, and propose a stacked architecture that incorporates it directly into an element-wise MLP. We augment the simple ETS with additional parameters and extend it to the complex field to reduce its inductive bias. Despite adding fewer than 1\% more parameters to the element-wise MLP, our models achieve results comparable to S4 on the LRA benchmark.
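To make the recurrence concrete, the following is a minimal sketch of element-wise exponential smoothing with a learnable complex-valued smoothing factor, folded into an MLP block. The class names (ComplexETS, ETSMLP) and the particular parameterization of the smoothing factor $\alpha$ are illustrative assumptions, not the paper's actual implementation; the abstract specifies only that ETS is augmented with extra parameters over the complex field.

\begin{verbatim}
import torch
import torch.nn as nn

class ComplexETS(nn.Module):
    """Element-wise exponential smoothing, h_t = a*u_t + (1-a)*h_{t-1},
    with a learnable complex smoothing factor a per channel (assumed form)."""
    def __init__(self, dim):
        super().__init__()
        # Unconstrained real parameters mapped to a complex alpha:
        # magnitude in (0, 1) via sigmoid, free phase (hypothetical choice).
        self.mag = nn.Parameter(torch.zeros(dim))
        self.phase = nn.Parameter(torch.zeros(dim))

    def forward(self, u):  # u: (batch, length, dim), real-valued
        alpha = torch.sigmoid(self.mag) * torch.exp(1j * self.phase)
        h = torch.zeros(u.shape[0], u.shape[2],
                        dtype=torch.cfloat, device=u.device)
        outs = []
        for t in range(u.shape[1]):  # sequential scan; a conv/FFT
            # form would be used in practice for speed
            h = alpha * u[:, t].to(torch.cfloat) + (1 - alpha) * h
            outs.append(h.real)     # project back to the reals
        return torch.stack(outs, dim=1)

class ETSMLP(nn.Module):
    """Element-wise MLP with the ETS recurrence inserted between layers."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.ets = ComplexETS(hidden)
        self.fc2 = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.fc2(torch.relu(self.ets(self.fc1(x))))
\end{verbatim}

Note that the ETS recurrence adds only $2 \times \mathrm{hidden}$ scalar parameters on top of the MLP's weight matrices, which is consistent with the abstract's claim of a sub-1\% parameter increase.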