We propose a novel non-parametric (untrainable) language model, the Non-Parametric Pairwise Attention Random Walk Model (NoPPA), which generates sentence embeddings using only pre-trained word embeddings and pre-counted word frequencies. To the best of our knowledge, this study is the first successful attempt to break the bag-of-words assumption with a non-parametric attention mechanism. We evaluate our method on eight downstream classification tasks. Experimental results show that NoPPA outperforms all bag-of-words-based methods on every dataset and, on average, achieves performance comparable to or better than state-of-the-art non-parametric methods. Furthermore, visualizations suggest that NoPPA captures contextual topics, common phrases, and causal relations between words. Our model is available at https://github.com/JacksonWuxs/NoPPA.