The directed acyclic word graph (DAWG) of a string $y$ of length $n$ is the smallest (partial) DFA that recognizes all suffixes of $y$; it has only $O(n)$ nodes and edges. In this paper, we show how to construct the DAWG for the input string $y$ from the suffix tree for $y$ in $O(n)$ time for integer alphabets of size polynomial in $n$. To this end, we first describe a folklore algorithm which, given the suffix tree for $y$, constructs the DAWG for the reversed string of $y$ in $O(n)$ time. We then present our algorithm, which builds the DAWG for $y$ itself from the suffix tree for $y$ in $O(n)$ time for integer alphabets. We also show that a straightforward modification of our DAWG construction algorithm leads to the first $O(n)$-time algorithm for constructing the affix tree of a given string $y$ over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. We then discuss how our constructions lead to linear-time algorithms for building other text indexing structures over integer alphabets, such as linear-size suffix tries and symmetric CDAWGs. As a further application of our $O(n)$-time DAWG construction algorithm, we show that the set $\mathsf{MAW}(y)$ of all minimal absent words (MAWs) of $y$ can be computed in optimal, input- and output-sensitive $O(n + |\mathsf{MAW}(y)|)$ time and $O(n)$ working space for integer alphabets.
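To make the object concrete, the following is a minimal Python sketch of a DAWG (suffix automaton) built with the classical online, character-by-character construction. It is not the suffix-tree-based algorithm of this paper, and with dictionary-based transitions it does not claim the integer-alphabet bound stated above. The class name `SuffixAutomaton` and its method names are assumptions chosen for illustration.

```python
class SuffixAutomaton:
    """DAWG of a string, built by the classical online construction (a sketch)."""

    def __init__(self, text):
        # State 0 is the initial state. For every state we keep its outgoing
        # transitions, its suffix link, and the length of its longest string.
        self.next = [{}]
        self.link = [-1]
        self.length = [0]
        self.last = 0
        for ch in text:
            self._extend(ch)
        # Accepting states are those on the suffix-link path from the state
        # of the whole string back to the initial state (the empty suffix).
        self.accepting = set()
        state = self.last
        while state != -1:
            self.accepting.add(state)
            state = self.link[state]

    def _new_state(self, length, transitions, link):
        self.next.append(dict(transitions))
        self.length.append(length)
        self.link.append(link)
        return len(self.next) - 1

    def _extend(self, ch):
        # Append one character and restore the automaton invariants.
        cur = self._new_state(self.length[self.last] + 1, {}, -1)
        p = self.last
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split q: the clone takes over the shorter strings of q.
                clone = self._new_state(self.length[p] + 1,
                                        self.next[q], self.link[q])
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def accepts(self, word):
        # Follow transitions from the initial state; accept only suffixes.
        state = 0
        for ch in word:
            if ch not in self.next[state]:
                return False
            state = self.next[state][ch]
        return state in self.accepting


# Usage: every suffix of y is accepted, a non-suffix factor is rejected.
y = "abcbc"
dawg = SuffixAutomaton(y)
assert all(dawg.accepts(y[i:]) for i in range(len(y) + 1))
assert not dawg.accepts("cb")
```

The sketch stores transitions in per-state dictionaries for brevity; an implementation aiming at the bounds discussed in the abstract would need a different representation and construction order.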