The Burrows-Wheeler Transform (BWT) is an efficient, invertible text transformation that tends to group identical characters into runs and supports searching the text. It is widely used in lossless compression, in indexing, and in bioinformatics for sequence alignment. There has been recent interest in minimizing the number of runs of identical characters ($r$) in a transform and in finding useful alphabet orderings for the sorting step of the matrix used in BWT construction. This motivates inspecting many transforms while developing algorithms. However, the full Burrows-Wheeler matrix requires $O(n^2)$ space and is therefore very difficult to display and inspect for large inputs. In this paper we present a graphical user interface (GUI) for working with BWTs, with features for searching for matrix row prefixes, skipping over sections of the right-most column (the transform), and displaying BWTs while exploring alphabet orderings with the goal of minimizing the number of runs.
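As an illustration of the quantities mentioned above, the following is a minimal sketch, not the tool presented in this paper. It builds the full rotation matrix naively, sorts its rows under a chosen alphabet ordering, reads off the right-most column, and counts the runs $r$. The function names `bwt` and `count_runs`, the `alphabet` parameter, and the `$` end-of-text sentinel are assumptions made for this example; the sentinel is assumed not to occur in the input.

```python
from itertools import groupby


def bwt(text, alphabet=None, sentinel="$"):
    """Naive BWT via the full sorted rotation matrix (O(n^2) space).

    `alphabet` is a string listing every symbol (sentinel included) from
    smallest to largest; it is a hypothetical parameter for this sketch.
    """
    s = text + sentinel  # assumes the sentinel does not occur in `text`
    if alphabet is None:
        # Default ordering: sentinel first, then symbols in code-point order.
        alphabet = sentinel + "".join(sorted(set(text)))
    rank = {ch: i for i, ch in enumerate(alphabet)}
    # Every cyclic rotation of s is one row of the Burrows-Wheeler matrix.
    rows = [s[i:] + s[:i] for i in range(len(s))]
    # Sort rows lexicographically under the chosen alphabet ordering.
    rows.sort(key=lambda row: [rank[ch] for ch in row])
    # The transform is the right-most column of the sorted matrix.
    return "".join(row[-1] for row in rows)


def count_runs(transformed):
    """Number of maximal runs of identical characters (r)."""
    return sum(1 for _ in groupby(transformed))


default = bwt("banana")                    # ordering $ < a < b < n
custom = bwt("banana", alphabet="$nba")    # ordering $ < n < b < a
print(default, count_runs(default))
print(custom, count_runs(custom))
```

In this small case the alternative ordering yields a transform with fewer runs than the default ordering, which is the kind of comparison that exploring alphabet orderings involves. The quadratic matrix built here is also the reason direct inspection becomes impractical for large inputs.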