Recommendation systems have gained wide popularity for a variety of personalized suggestion tasks, but the ever-increasing volume of user data makes real-time processing difficult. NAND flash memory-based in-storage computing (ISC) is a favorable candidate among the various acceleration approaches because flash memory typically offers a larger capacity than other memory types, so it can efficiently handle the large amount of user data needed for recommendation inference services. However, unlike other neural network applications, where data is fetched sequentially from memory, recommendation systems exhibit an irregular, random memory access pattern. As a result, most of the data loaded from the NAND flash array into the page buffer goes unused and a large portion of the internal bandwidth is underutilized, which degrades the performance of recommendation inference acceleration. In this paper, we propose RecFlash, a fast recommendation inference accelerator that combines a data remapping algorithm with NAND flash-based ISC. The experimental results show that our proposed method reduces latency and energy consumption by up to 81% and 91.9%, respectively, compared with the existing NAND flash-based ISC architecture.
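To make the page-buffer argument concrete, the following is a minimal sketch of why scattered embedding lookups waste page reads and how a remapping of rows to flash slots can reduce them. It is not the RecFlash algorithm, which the abstract does not specify: the frequency-based heuristic, the toy sizes (`NUM_ROWS`, `ROWS_PER_PAGE`), the query trace, and all function names are assumptions chosen only for illustration.

```python
from collections import Counter

NUM_ROWS = 16        # assumed number of embedding rows (toy value)
ROWS_PER_PAGE = 4    # assumed number of embedding rows per flash page (toy value)


def pages_touched(query, slot_of, rows_per_page):
    """Count the distinct flash pages read to serve one lookup query."""
    return len({slot_of[row] // rows_per_page for row in query})


def total_page_reads(queries, slot_of, rows_per_page):
    """Sum the page reads over all queries (no caching across queries)."""
    return sum(pages_touched(q, slot_of, rows_per_page) for q in queries)


def remap_by_frequency(queries, num_rows):
    """Hypothetical remapping: give the most frequently accessed rows
    adjacent slots so that they share pages. Ties break by row index."""
    freq = Counter(row for q in queries for row in q)
    order = sorted(range(num_rows), key=lambda r: (-freq[r], r))
    return {row: slot for slot, row in enumerate(order)}


def utilization(queries, slot_of, rows_per_page):
    """Fraction of the rows loaded into the page buffer that were requested."""
    useful = sum(len(set(q)) for q in queries)
    loaded = total_page_reads(queries, slot_of, rows_per_page) * rows_per_page
    return useful / loaded


# Toy trace: the hot rows 0, 5, 10, 15 sit on four different pages
# under the original (identity) layout.
queries = [[0, 5, 10, 15], [0, 5, 10], [5, 10, 15], [0, 15, 3]]

identity = {row: row for row in range(NUM_ROWS)}
remapped = remap_by_frequency(queries, NUM_ROWS)

print(total_page_reads(queries, identity, ROWS_PER_PAGE))  # 12
print(total_page_reads(queries, remapped, ROWS_PER_PAGE))  # 5
print(utilization(queries, identity, ROWS_PER_PAGE))
print(utilization(queries, remapped, ROWS_PER_PAGE))
```

In this toy trace the same lookups need 12 page reads under the original layout and 5 after remapping, because the frequently co-accessed rows end up on one page. A real design would also have to account for multiple embedding tables, changing access statistics, and the cost of the remapping itself, none of which this sketch models.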