Several algorithms are known for efficiently computing the prefix probability of a string under a probabilistic context-free grammar (PCFG). Efficient algorithms for this problem run in time cubic in the length of the input string. However, some of the proposed algorithms are suboptimal with respect to the size of the grammar. This paper proposes a novel speed-up of Jelinek and Lafferty's (1991) algorithm, whose original runtime is $O(n^3 |N|^3 + |N|^4)$, where $n$ is the input length and $|N|$ is the number of non-terminals in the grammar. In contrast, our speed-up runs in $O(n^2 |N|^3+n^3|N|^2)$.
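
To make the gap between the two bounds concrete, here is a rough comparison that reads them as operation counts. This is an illustrative assumption: it ignores constant factors and the additive $|N|^4$ term of the original bound. Factoring out the common part gives

$$
n^3 |N|^3 = n^2 |N|^2 \cdot n|N|,
\qquad
n^2 |N|^3 + n^3 |N|^2 = n^2 |N|^2 \cdot \left(n + |N|\right).
$$

Since $n + |N| \le n|N|$ whenever $n \ge 2$ and $|N| \ge 2$ (equivalently, $(n-1)(|N|-1) \ge 1$), the new bound is never larger in that regime, and the ratio of the leading terms is

$$
\frac{n^3 |N|^3}{n^2 |N|^3 + n^3 |N|^2} = \frac{n|N|}{n + |N|}.
$$

For example, when $n = |N|$ this ratio is $n/2$, so under these assumptions the improvement grows linearly as the input length and the grammar size grow together.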