GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping

Julien Eudine, Chu Li, Zhuo Cheng, Renzo Andri, Can Firtina, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Konstantina Koliogeorgi, Anirban Nag, Arash Tavakkol, Haiyu Mao, Onur Mutlu, Shai Bergman, Ji Zhang

Genome sequencing has become a central focus in computational biology. A genome study typically begins with sequencing, which produces millions to billions of short DNA fragments known as reads. Read mapping aligns these reads to a reference genome. Read mapping for short reads comes in two forms: single-end and paired-end, with the latter being more prevalent due to its higher accuracy and support for advanced analysis. Read mapping remains a major performance bottleneck in genome analysis due to expensive dynamic programming. Prior efforts have attempted to mitigate this cost by employing filters to identify and potentially discard computationally expensive matches and leveraging hardware accelerators to speed up the computations. While partially effective, these approaches have limitations. In particular, existing filters are often ineffective for paired-end reads, as they evaluate each read independently and exhibit relatively low filtering ratios. In this work, we propose GenPairX, a hardware-algorithm co-designed accelerator that efficiently minimizes the computational load of paired-end read mapping while enhancing the throughput of memory-intensive operations. GenPairX introduces: (1) a novel filtering algorithm that jointly considers both reads in a pair to improve filtering effectiveness, and a lightweight alignment algorithm to replace most of the computationally expensive dynamic programming operations, and (2) two specialized hardware mechanisms to support the proposed algorithms. Our evaluations show that GenPairX delivers substantial performance improvements over state-of-the-art solutions, achieving 1575x and 1.43x higher throughput per watt compared to leading CPU-based and accelerator-based read mappers, respectively, all without compromising accuracy.
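
To make the idea of filtering on both reads of a pair concrete, the following is a minimal sketch of a generic paired-adjacency check. It is not GenPairX's algorithm, which the abstract does not specify. The function name `paired_filter`, the candidate-position lists, and the `max_dist` threshold are all hypothetical and chosen for illustration. The sketch assumes each read already has a list of candidate reference positions (for example, from seeding) and keeps only those combinations in which the two mates land within a plausible fragment distance of each other.

```python
from bisect import bisect_left, bisect_right


def paired_filter(cands1, cands2, max_dist):
    """Return candidate pairs (p1, p2) with |p1 - p2| <= max_dist.

    cands1, cands2: candidate reference positions for the two mates
                    (hypothetical inputs, e.g. produced by a seeding step).
    max_dist:       assumed upper bound on the distance between mates.
    """
    c2 = sorted(cands2)
    pairs = []
    for p1 in sorted(cands1):
        # Binary-search the window of mate positions close enough to p1.
        lo = bisect_left(c2, p1 - max_dist)
        hi = bisect_right(c2, p1 + max_dist)
        pairs.extend((p1, p2) for p2 in c2[lo:hi])
    return pairs


# Illustrative positions only; these are not data from the paper.
cands_read1 = [100, 5_000, 90_000]
cands_read2 = [350, 42_000, 90_400]
survivors = paired_filter(cands_read1, cands_read2, max_dist=500)
print(survivors)  # [(100, 350), (90000, 90400)]
```

In this toy input, each read has three candidate locations on its own, but only two combinations survive the joint check, so only those would need to be passed on to a more expensive alignment step.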
