Recently, a new paradigm was introduced in automata theory. The main idea is to classify regular languages according to their propensity to be sorted, establishing a deep connection between automata theory and data compression [J. ACM 2023]. This parameterization leads to two hierarchies of regular languages: a deterministic hierarchy and a non-deterministic hierarchy. While the deterministic hierarchy is well understood, the non-deterministic hierarchy appears much more complex. This is true even for the richest and most studied level of the hierarchies, corresponding to the class of Wheeler languages. In this paper, we study Wheeler languages through the lens of bisimulations. We first show that the standard notion of bisimulation is not appropriate. Then, we introduce Wheeler bisimulations, that is, bisimulations that respect the convex structure of the considered Wheeler automata. Although the properties of Wheeler bisimulations differ in some respects from those of standard bisimulations, we show that Wheeler bisimulations induce a unique minimal Wheeler NFA (analogously to standard bisimulations). In particular, in the deterministic case, we retrieve the minimum Wheeler deterministic automaton of a given language. We also show that the minimal Wheeler NFA induced by Wheeler bisimulations can be built in linear time. This is in contrast with standard bisimulations, for which the corresponding minimal NFA can be built in $ O(m \log n) $ time (where $ m $ is the number of edges and $ n $ is the number of states) by adapting Paige-Tarjan's partition refinement algorithm.
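To make the baseline concrete, the following is a minimal sketch of how a standard bisimulation quotient of an NFA can be computed by partition refinement. It is an illustration only: it uses naive signature-based refinement rather than Paige-Tarjan's $ O(m \log n) $ algorithm, it handles standard bisimulations rather than the Wheeler bisimulations introduced in the paper, and it assumes a forward bisimulation defined on outgoing transitions. The function names `coarsest_bisimulation` and `quotient` and the triple-based edge encoding are assumptions made for this example.

```python
from collections import defaultdict


def coarsest_bisimulation(states, finals, edges):
    """Map each state to a block id of the coarsest (forward) bisimulation.

    states: iterable of hashable states
    finals: set of accepting states
    edges:  iterable of (source, label, target) triples
    Naive refinement, not Paige-Tarjan: each round costs O(n + m).
    """
    states = list(states)
    succ = defaultdict(set)
    for u, a, v in edges:
        succ[u].add((a, v))

    # Initial partition: accepting versus non-accepting states.
    block = {q: int(q in finals) for q in states}
    while True:
        # A state's signature is its current block together with the set of
        # (label, successor block) pairs reachable in one step.
        sig = {
            q: (block[q], frozenset((a, block[v]) for a, v in succ[q]))
            for q in states
        }
        ids = {}
        new_block = {q: ids.setdefault(sig[q], len(ids)) for q in states}
        # Each round refines the previous partition, so an unchanged block
        # count means the partition is stable.
        if len(ids) == len(set(block.values())):
            return new_block
        block = new_block


def quotient(states, initial, finals, edges):
    """Merge bisimilar states; returns (states, initial, finals, edges)."""
    block = coarsest_bisimulation(states, finals, edges)
    q_edges = {(block[u], a, block[v]) for u, a, v in edges}
    q_finals = {block[q] for q in finals}
    return set(block.values()), block[initial], q_finals, q_edges


# Hypothetical example: states 1 and 2 are bisimilar and get merged.
example_edges = [(0, "a", 1), (0, "a", 2), (1, "b", 3), (2, "b", 3)]
q_states, q_init, q_finals, q_edges = quotient(
    [0, 1, 2, 3], 0, {3}, example_edges
)
```

In this small example the four-state NFA collapses to three states, because states 1 and 2 have the same acceptance status and the same labelled successors. Per the abstract, a quotient of this standard kind is not appropriate for Wheeler automata, which is what motivates restricting to bisimulations that respect the convex structure.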