Backreferences and lookaheads are vital features that make classical regular expressions (REGEX) practical. Although these features are widely used, the unrestricted combination of the two remains poorly understood. Practically, most likely no implementation fully supports them. Theoretically, while some studies have addressed these features separately, few have dared to combine them. Those few studies have made clear that combining these features makes REGEX significantly more expressive. However, no acceptable expressivity bound for REWBLk (REGEX with backreferences and lookaheads) has been established. We elucidate this by establishing that REWBLk coincides with NLOG, the class of languages accepted by log-space nondeterministic Turing machines (NTMs). In translating REWBLk to log-space NTMs, negative lookaheads are the most challenging part, since they essentially require complementing log-space NTMs in nondeterministic log-space. To address this problem, we revisit the Immerman–Szelepcsényi theorem. In addition, we employ log-space nested-oracle NTMs to naturally handle nested lookaheads of REWBLk. Utilizing such oracle machines, we also present the new result that the membership problem of REWBLk is PSPACE-complete.
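As a concrete illustration of the two features discussed above, the following minimal sketch uses Python's `re` module. This is an assumption made for illustration only: Python's syntax and matching semantics are not the formal REWBLk definition, and the pattern and function names are hypothetical. The first pattern uses a backreference alone; the second combines a backreference with a negative lookahead.

```python
import re

# Backreference: \1 must repeat exactly the text captured by group 1,
# so this pattern describes { a^n b a^n : n >= 1 }, which is not regular.
AN_B_AN = re.compile(r"(a+)b\1")

# Negative lookahead plus backreference: the word must have the form ww
# over {a, b}, and (?!.*ab) rejects any word containing the factor "ab".
SQUARE_WITHOUT_AB = re.compile(r"(?!.*ab)([ab]+)\1")


def matches_an_b_an(word: str) -> bool:
    """Return True if the whole word has the form a^n b a^n (n >= 1)."""
    return AN_B_AN.fullmatch(word) is not None


def is_square_without_ab(word: str) -> bool:
    """Return True if the whole word is ww and contains no factor "ab"."""
    return SQUARE_WITHOUT_AB.fullmatch(word) is not None
```

For instance, `matches_an_b_an("aabaa")` holds while `matches_an_b_an("aabaaa")` does not, and `is_square_without_ab("baba")` fails even though `"baba"` is a square, because the lookahead sees the factor `"ab"`.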