Aperiodicity, Star-freeness, and First-order Logic Definability of Structured Context-Free Languages

A classic result in formal language theory is the equivalence of three characterizations of a class of regular languages: being non-counting (aperiodic), being definable by star-free regular expressions, and being definable in first-order logic. Past attempts to extend this result beyond regular languages have met with difficulties: for instance, it is known that star-free tree languages may violate the non-counting property, and that there are aperiodic tree languages that cannot be defined in first-order logic. We extend these classic equivalence results to a significant family of deterministic context-free languages, the operator-precedence languages (OPL), which strictly includes the widely investigated visibly pushdown (also known as input-driven) family and other structured context-free languages. The OP model originated in the 1960s for defining programming languages and is still used by high-performance compilers; its rich algebraic properties were first investigated in connection with grammar learning and have recently been complemented by further closure properties and by a characterization in monadic second-order logic. We introduce an extension of regular expressions, the OP-expressions (OPE), which define the OPLs and which, under the star-free hypothesis, define first-order definable and non-counting OPLs. Then we prove, through a fairly elaborate grammar transformation, that aperiodic OPLs are first-order definable. Thus, the classic equivalence of star-freeness, aperiodicity, and first-order definability is established for the large and powerful class of OPLs. We argue that the same approach can be exploited to obtain analogous results for visibly pushdown languages as well.
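As an illustration of the non-counting property in the classic regular case only (not of the OPL construction in this paper), the following minimal sketch tests whether the transition monoid of a DFA is aperiodic, that is, whether every element m satisfies m^k = m^(k+1) for some k. It assumes the DFA is complete and minimal, so that its transition monoid coincides with the syntactic monoid of the language. The function names and the encoding of the transition table as tuples are hypothetical choices made for this example.

```python
def transition_monoid(n_states, delta):
    """Return all state transformations induced by words.

    delta maps each letter to a tuple t of length n_states,
    where t[q] is the state reached from q on that letter
    (hypothetical encoding chosen for this sketch).
    """
    identity = tuple(range(n_states))
    generators = list(delta.values())
    seen = {identity}
    frontier = [identity]
    while frontier:
        f = frontier.pop()
        for g in generators:
            # Transformation of the word "w followed by one letter":
            # first apply f, then the letter's transformation g.
            h = tuple(g[f[q]] for q in range(n_states))
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def is_aperiodic(n_states, delta):
    """True if every monoid element m satisfies m^k == m^(k+1) for some k."""
    for m in transition_monoid(n_states, delta):
        powers = []
        power = m
        # Collect m, m^2, ... until a power repeats.
        while power not in powers:
            powers.append(power)
            power = tuple(m[power[q]] for q in range(n_states))
        # The repeated power closes a cycle; its length must be 1.
        if len(powers) - powers.index(power) != 1:
            return False
    return True


# (ab)*: state 0 is initial and accepting, state 2 is the sink.
AB_STAR = {"a": (1, 2, 2), "b": (2, 0, 2)}
# (aa)*: counts the parity of the number of a's.
AA_STAR = {"a": (1, 0)}

print(is_aperiodic(3, AB_STAR))
print(is_aperiodic(2, AA_STAR))
```

Under these assumptions, the automaton for (ab)* passes the test, while the one for (aa)* fails because the letter a induces a permutation of order 2, which is exactly the modulo-2 counting that the non-counting property forbids.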
