Sorting is fundamental and ubiquitous in modern computing systems. Hardware sorting systems are built on comparison operations in the von Neumann architecture, but their performance is limited by the bandwidth between memory and comparison units and by the performance of complementary metal-oxide-semiconductor (CMOS) based circuitry. Sort-in-memory (SIM) based on emerging memristors is desirable but not yet available, because comparison operations are challenging to implement within memristive memory. Here we report a fast and reconfigurable SIM system enabled by digit read (DR) on 1-transistor-1-resistor (1T1R) memristor arrays. We develop DR tree node skipping (TNS), which supports variable data quantities and data types, and extend TNS with multi-bank, bit-slice and multi-level strategies to enable cross-array TNS (CA-TNS) for practical adoption. Evaluated on benchmark sorting datasets, our memristor-enabled SIM system achieves a 3.32x to 7.70x speedup, a 6.23x to 183.5x improvement in energy efficiency and a 2.23x to 7.43x reduction in area compared with state-of-the-art sorting systems. We apply this SIM system to shortest path search with Dijkstra's algorithm and to neural network inference with in-situ pruning, demonstrating its capability to solve practical sorting tasks and its compatibility with other compute-in-memory (CIM) schemes. The comparison-free TNS/CA-TNS SIM enabled by memristors pushes sorting into a new paradigm of sort-in-memory for next-generation sorting systems.
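To make the idea of comparison-free sorting by digit reads more concrete, the following is a minimal software sketch of a generic bit-serial minimum search. It is an illustration under stated assumptions, not the authors' method: the function name `digit_read_sort`, the `width` parameter and the restriction to non-negative integers are hypothetical choices, and the sketch does not attempt to reproduce the TNS node-skipping or CA-TNS strategies, whose details are not given in the abstract. Each pass reads one bit position across all candidate rows, from the most significant bit downward, and keeps only the rows that read 0 whenever at least one such row exists. No two stored values are ever compared with each other.

```python
def digit_read_sort(values, width):
    """Sort non-negative integers in ascending order using only bit reads.

    Assumption: every value fits in `width` bits. This is a generic
    bit-serial minimum search, not the TNS algorithm from the abstract.
    """
    remaining = list(range(len(values)))  # row indices not yet emitted
    out = []
    while remaining:
        candidates = remaining
        # Scan bit columns from the most significant to the least significant.
        for bit in range(width - 1, -1, -1):
            zeros = [i for i in candidates if not (values[i] >> bit) & 1]
            # If any candidate reads 0 at this bit, rows reading 1 cannot
            # hold the minimum, so they are excluded from this search.
            if zeros:
                candidates = zeros
        # All surviving candidates now hold the same (minimum) value.
        out.extend(values[i] for i in candidates)
        emitted = set(candidates)
        remaining = [i for i in remaining if i not in emitted]
    return out


if __name__ == "__main__":
    print(digit_read_sort([5, 3, 7, 3, 0], width=3))
```

Note that this sketch restarts from the most significant bit for every minimum it extracts, so it repeats reads that a scheme which skips already-resolved parts of the search would presumably avoid.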