Emulation-Completeness of Programming Languages

We study when a programming language can emulate programs written in that same language without delegating the guest program back to the host evaluator or compiler. We call this property emulation-completeness. The central observation is that Turing-completeness alone is not enough: a self-emulator must not only compute the guest program's result, but must also account for the guest-visible state on which realistic programs depend, including control flow, exceptions, callbacks, timing, memory usage, and runtime metadata such as stack traces or line numbers. This is a systematization paper. Its contribution is not a new emulator implementation, but a precise vocabulary and a structured taxonomy for reasoning about self-emulation. We distinguish source-level evaluation from compiled-code emulation, define syntactic and compiled-code emulation-completeness, and separate weak from strong emulation-completeness according to how much observable runtime behavior must be preserved. We then organize the requirements into two classes: language-side requirements, which determine whether the guest semantics can be represented explicitly inside the language, and emulator-side requirements, which determine whether the resulting emulator can faithfully mask or reproduce the relevant observations. The discussion is grounded in concrete examples, including publicly documented details from Erlang, where argument limits, bitstring pattern matching, and message reception expose subtle mismatches between direct execution and self-emulation. The resulting framework is intended as guidance for language designers, implementers of evaluators and emulators, and researchers interested in secure sandboxing, decompilation, and reflective execution.
