The rapid advancement of artificial intelligence (AI) has been marked by large language models that exhibit human-like intelligence. However, these models also pose unprecedented challenges for energy consumption and environmental sustainability. One promising solution is to revisit analogue computing, a technique that predates digital computing and can now exploit emerging analogue electronic devices such as resistive memory, which offers in-memory computing, high scalability, and nonvolatility. However, analogue computing still faces the same challenges as before: programming nonidealities and costly programming, both rooted in the underlying device physics. Here, we report a universal solution: a software-hardware co-design that uses structural plasticity-inspired edge pruning to optimize the topology of a randomly weighted analogue resistive memory neural network. Software-wise, the topology of a randomly weighted neural network is optimized by pruning connections rather than by precisely tuning resistive memory weights. Hardware-wise, we reveal the physical origin of the programming stochasticity using transmission electron microscopy and leverage this stochasticity for large-scale, low-cost implementation of an overparameterized random neural network containing high-performance sub-networks. We implemented the co-design on a 40 nm 256K resistive memory macro and observed 17.3% and 19.9% accuracy improvements in image and audio classification on the FashionMNIST and Spoken digits datasets, respectively, as well as a 9.8% (2%) improvement in PR (ROC) for image segmentation on the DRIVE dataset. These gains are accompanied by 82.1%, 51.2%, and 99.8% improvements in energy efficiency, respectively, thanks to analogue in-memory computing. By embracing intrinsic stochasticity and in-memory computing, this work may overcome the biggest obstacle facing analogue computing systems and thus unleash their immense potential for next-generation AI hardware.
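The idea of optimizing topology by pruning connections, rather than tuning weights, can be illustrated with a small toy sketch. The code below is not the authors' algorithm or hardware flow. It is a minimal greedy pruning loop on a single neuron with frozen random weights, assuming a made-up binary task; all names (`make_data`, `greedy_prune`, `EDGES_PER_INPUT`) are hypothetical and chosen only for illustration. The random weights stand in for stochastically programmed resistive memory cells, and the binary mask stands in for the learned topology.

```python
import random


def make_data(n, dim, rng):
    """Toy binary task (an assumption): label is 1 when x[0] + x[1] > 0."""
    xs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    ys = [1 if x[0] + x[1] > 0 else 0 for x in xs]
    return xs, ys


def predict(weights, mask, x):
    """Single neuron; edge e reads input e % len(x), pruned edges contribute 0."""
    total = sum(w * m * x[e % len(x)]
                for e, (w, m) in enumerate(zip(weights, mask)))
    return 1 if total > 0 else 0


def accuracy(weights, mask, xs, ys):
    hits = sum(predict(weights, mask, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def greedy_prune(weights, xs, ys):
    """Drop one edge at a time, keeping a removal only if accuracy improves.

    The weights are never modified; only the binary mask (the topology) changes.
    """
    mask = [1] * len(weights)
    best = accuracy(weights, mask, xs, ys)
    improved = True
    while improved:
        improved = False
        for i in range(len(mask)):
            if mask[i] == 0:
                continue
            mask[i] = 0  # tentatively prune edge i
            trial = accuracy(weights, mask, xs, ys)
            if trial > best:
                best, improved = trial, True
            else:
                mask[i] = 1  # restore the edge
    return mask, best


rng = random.Random(0)
DIM, EDGES_PER_INPUT = 8, 4  # several random edges per input (overparameterized)
xs, ys = make_data(200, DIM, rng)
weights = [rng.uniform(-1, 1) for _ in range(DIM * EDGES_PER_INPUT)]
frozen = list(weights)  # copy used to confirm the weights stay untouched

dense_acc = accuracy(weights, [1] * len(weights), xs, ys)
mask, pruned_acc = greedy_prune(weights, xs, ys)
print(f"dense accuracy:  {dense_acc:.2f}")
print(f"pruned accuracy: {pruned_acc:.2f} with {sum(mask)}/{len(mask)} edges kept")
```

Because a removal is accepted only when training accuracy strictly increases, the pruned sub-network is never worse than the dense random network on the data used for pruning. A practical method would score edges with gradients and evaluate on held-out data; the greedy search here is chosen only because it is short and easy to verify.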