User-curated item lists, such as video playlists on YouTube and book lists on Goodreads, have become a prevalent way to share content on online platforms. Item list continuation is the task of modeling the overall trend of a list and predicting the items that follow. Recently, Transformer-based models have shown promise in comprehending contextual information and capturing item relationships within a list. However, deploying them in real-time industrial applications is challenging, mainly because the autoregressive generation mechanism they rely on is time-consuming. In this paper, we propose a novel fast non-autoregressive sequence generation model, namely FANS, to improve both inference efficiency and quality for item list continuation. First, we use a non-autoregressive generation mechanism to decode the next $K$ items simultaneously, rather than one by one as existing models do. Then, we design a two-stage classifier to replace the vanilla classifier used in current Transformer-based models, which further reduces decoding time. Moreover, to improve the quality of non-autoregressive generation, we employ a curriculum learning strategy to optimize training. Experimental results on four real-world item list continuation datasets (Zhihu, Spotify, AotM, and Goodreads) show that FANS significantly improves inference efficiency (up to 8.7x) while achieving competitive or better generation quality compared with state-of-the-art autoregressive models. We also validate the efficiency of FANS in an industrial setting. Our source code and data will be available at MindSpore/models and GitHub.
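The difference between one-by-one and simultaneous decoding can be illustrated with a toy sketch. This is not the FANS implementation: `ToyScorer`, its random score table, `NUM_ITEMS`, and both `decode_*` helpers are hypothetical stand-ins, assumed here only to show that autoregressive decoding needs $K$ forward passes while non-autoregressive decoding needs one.

```python
import random

NUM_ITEMS = 50  # hypothetical item vocabulary size
K = 5           # number of items to append to the list


class ToyScorer:
    """Hypothetical stand-in for a Transformer; it only counts forward passes."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Assumed fixed random scores, one row per output position.
        self.table = [[rng.random() for _ in range(NUM_ITEMS)] for _ in range(K)]
        self.calls = 0

    def forward(self, prefix, num_positions):
        """Return `num_positions` score vectors over all items in one pass."""
        self.calls += 1
        offset = sum(prefix) % NUM_ITEMS  # makes the scores depend on the prefix
        return [
            [self.table[p][(i + offset) % NUM_ITEMS] for i in range(NUM_ITEMS)]
            for p in range(num_positions)
        ]


def argmax(scores):
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def decode_autoregressive(model, prefix, k):
    """One forward pass per generated item; each item is fed back as input."""
    items = list(prefix)
    for _ in range(k):
        scores = model.forward(items, 1)[0]
        items.append(argmax(scores))
    return items[len(prefix):]


def decode_non_autoregressive(model, prefix, k):
    """A single forward pass scores all k positions at once."""
    return [argmax(scores) for scores in model.forward(prefix, k)]


ar_model, nar_model = ToyScorer(), ToyScorer()
ar_items = decode_autoregressive(ar_model, [3, 7, 11], K)
nar_items = decode_non_autoregressive(nar_model, [3, 7, 11], K)
print(ar_model.calls, nar_model.calls)  # prints "5 1"
```

In this toy setting the saving comes purely from the number of forward passes. The sketch says nothing about generation quality, which the abstract addresses separately through curriculum learning.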