MONI (Rossi et al., {\it JCB} 2022) is a BWT-based compressed index for computing the matching statistics and maximal exact matches (MEMs) of a pattern (usually a DNA read) with respect to a highly repetitive text (usually a database of genomes). It uses two operations: LF-steps, and longest common extension (LCE) queries on a grammar-compressed representation of the text. In practice, most of the operations are constant-time LF-steps, but most of the running time is spent evaluating LCE queries. In this paper we show how (a variant of) LCE queries can be evaluated lazily, so as to bound the total time MONI needs to process the pattern in terms of the number of MEMs between the pattern and the text, while maintaining logarithmic latency.
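
To make the two quantities named above concrete, the following is a minimal sketch of matching statistics and MEMs computed by brute force. It is not MONI's algorithm: it uses no BWT, LF-steps, or LCE queries, and it runs in time polynomial in the input lengths rather than in the bounds discussed here. The function names `matching_statistics` and `mems_from_ms` are hypothetical and chosen only for illustration. The sketch assumes the usual definitions: the matching statistic at position i is the length of the longest prefix of the pattern suffix starting at i that occurs in the text, and a MEM is a match that can be extended neither to the left nor to the right.

```python
def matching_statistics(pattern, text):
    """Brute force: ms[i] is the length of the longest prefix of
    pattern[i:] that occurs somewhere in text."""
    ms = []
    for i in range(len(pattern)):
        length = 0
        # Extend the match one character at a time while it still occurs.
        while i + length < len(pattern) and pattern[i:i + length + 1] in text:
            length += 1
        ms.append(length)
    return ms


def mems_from_ms(ms):
    """Return (start, length) pairs of maximal exact matches.

    A match starting at i with length ms[i] is right-maximal by definition.
    It is left-maximal exactly when i == 0 or ms[i - 1] <= ms[i], because
    otherwise the match starting at i - 1 already covers it.
    """
    mems = []
    for i, length in enumerate(ms):
        if length > 0 and (i == 0 or ms[i - 1] <= length):
            mems.append((i, length))
    return mems


if __name__ == "__main__":
    text = "GATTACA"
    pattern = "TTAGAT"
    ms = matching_statistics(pattern, text)
    print(ms)                # [3, 2, 1, 3, 2, 1]
    print(mems_from_ms(ms))  # [(0, 3), (3, 3)], i.e. "TTA" and "GAT"
```

In this toy example the pattern has six positions but only two MEMs, which illustrates why a running-time bound stated in terms of the number of MEMs can be much smaller than one stated in terms of the pattern length.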