Basecalling, an essential step in many genome analysis studies, relies on large Deep Neural Networks (DNNs) to achieve high accuracy. Unfortunately, these DNNs are computationally slow and inefficient, leading to considerable delays and resource constraints in the sequence analysis process. A Computation-In-Memory (CIM) architecture using memristors can significantly accelerate DNNs. However, inherent device non-idealities and architectural limitations of such designs can greatly degrade basecalling accuracy, which is critical for accurate genome analysis. To facilitate the adoption of memristor-based CIM designs for basecalling, it is important to (1) conduct a comprehensive analysis of potential CIM architectures and (2) develop effective strategies for mitigating the possible adverse effects of inherent device non-idealities and architectural limitations. This paper proposes Swordfish, a novel hardware/software co-design framework that addresses both of these issues. Swordfish incorporates seven circuit and device restrictions or non-idealities from characterized real memristor-based chips, and it leverages various hardware/software co-design solutions to mitigate the basecalling accuracy loss due to such non-idealities. To demonstrate the effectiveness of Swordfish, we take Bonito, the state-of-the-art (i.e., accurate and fast) open-source basecaller, as a case study. Our experimental results using Swordfish show that a CIM architecture can realistically accelerate Bonito for a wide range of real datasets by an average of 25.7x, with an accuracy loss of 6.01%.