Linear recurrent models, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long-sequence modelling benchmarks. Despite their success, they come with a number of drawbacks, most notably their complex initialisation and normalisation schemes. In this work, we address some of these issues by proposing RotRNN -- a linear recurrent model which utilises the convenient properties of rotation matrices. We show that RotRNN provides a simple model with fewer theoretical assumptions than prior works, with a practical implementation that remains faithful to its theoretical derivation, achieving scores comparable to those of the LRU and SSMs on several long-sequence modelling datasets.
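One of the convenient properties of rotation matrices is that they preserve vector norms, so in a linear recurrence the growth or decay of the hidden state can be controlled by a separate scalar. The following is a minimal sketch of that idea only, not RotRNN's actual parameterisation: the 2-dimensional state, the fixed angle `theta`, the scalar decay `gamma`, the identity input map, and all function names are assumptions made for illustration.

```python
import math


def rotation(theta):
    """Return the 2x2 rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def linear_recurrence(inputs, theta, gamma):
    """Run x_t = gamma * R(theta) @ x_{t-1} + u_t from x_0 = 0.

    Hypothetical toy recurrence: the input u_t is added directly
    (identity input map) and there is no output projection.
    """
    R = rotation(theta)
    x = [0.0, 0.0]
    states = []
    for u in inputs:
        rx = matvec(R, x)
        x = [gamma * rx[0] + u[0], gamma * rx[1] + u[1]]
        states.append(x)
    return states


def norm(v):
    """Euclidean norm of a length-2 vector."""
    return math.hypot(v[0], v[1])


# Feed a unit impulse followed by zeros and track the state norm.
inputs = [(1.0, 0.0)] + [(0.0, 0.0)] * 4
states = linear_recurrence(inputs, theta=0.7, gamma=0.9)
norms = [norm(x) for x in states]
```

Because the rotation leaves the norm unchanged, the state norm after the impulse shrinks by exactly the factor `gamma` at each step, whatever the value of `theta`. In this toy setting, stability is therefore governed by the single scalar `gamma` alone.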