Known simulations of random access machines (RAMs) or parallel RAMs (PRAMs) by Boolean circuits incur significant polynomial blowup, because the circuit must repeatedly simulate accesses to a large main memory. Consider two modifications to Boolean circuits: (1) remove the restriction that circuit graphs are acyclic and (2) enhance AND gates such that they output zero eagerly. That is, if an AND gate has a zero input, it 'short circuits' and outputs zero without waiting for its second input. We call this the cyclic circuit model. Note that circuits in this model remain combinational, as they do not allow wire values to change over time. We simulate a bounded-word-size PRAM via a cyclic circuit, and the blowup from the simulation is only polylogarithmic. Consider a PRAM program $P$ that on a length-$n$ input uses an arbitrary number of processors to manipulate words of size $\Theta(\log n)$ bits and then halts within $W(n)$ work. We construct a size-$O(W(n)\cdot \log^4 n)$ cyclic circuit that simulates $P$. Suppose that on a particular input, $P$ halts in time $T$; our circuit computes the same output within $T \cdot O(\log^3 n)$ gate delay. This implies the theoretical feasibility of powerful parallel machines: cyclic circuits can be implemented in hardware, and our circuit achieves performance within polylogarithmic factors of the PRAM. Our simulated PRAM synchronizes processors simply by leveraging logical dependencies between wires.
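To make the two modifications concrete, the following Python sketch evaluates a small cyclic circuit with eager AND gates. It is a toy illustration, not the PRAM construction itself. The gate encoding, the helper names (`eager_and`, `xor`, `evaluate`), the use of XOR gates to build multiplexers, and the example circuit are all assumptions introduced here. The example routes an input `x` through two blocks, `F(v) = v XOR a` and `G(v) = v AND b`, in an order chosen by the select bit `s`, so the gate graph contains a cycle. Because an AND gate with a zero input outputs zero without waiting for its other input, the cycle is broken logically on every input, and each wire is assigned a value exactly once.

```python
from typing import Dict, Optional, Tuple

Gate = Tuple[str, str, str]  # (operation, left input wire, right input wire)
Value = Optional[int]        # None means "not yet determined"


def eager_and(a: Value, b: Value) -> Value:
    """AND that short-circuits: a single zero input already fixes the output."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return None  # still waiting on an input


def xor(a: Value, b: Value) -> Value:
    """XOR needs both inputs before it can produce an output."""
    if a is None or b is None:
        return None
    return a ^ b


def evaluate(gates: Dict[str, Gate], inputs: Dict[str, int]) -> Dict[str, Value]:
    """Propagate values until no further wire can be determined.

    A wire is written at most once, so values never change over time.
    """
    wires: Dict[str, Value] = {name: None for name in gates}
    wires.update(inputs)
    changed = True
    while changed:
        changed = False
        for out, (op, left, right) in gates.items():
            if wires[out] is not None:
                continue
            a, b = wires.get(left), wires.get(right)
            value = eager_and(a, b) if op == "AND" else xor(a, b)
            if value is not None:
                wires[out] = value
                changed = True
    return wires


# Hypothetical example circuit: s = 1 computes G(F(x)), s = 0 computes F(G(x)).
# The cycle is f_in -> f_out -> g_in_l -> g_in -> g_out -> f_in_r -> f_in.
GATES: Dict[str, Gate] = {
    "ns":     ("XOR", "s", "one"),        # NOT s
    "f_in_l": ("AND", "s", "x"),
    "f_in_r": ("AND", "ns", "g_out"),
    "f_in":   ("XOR", "f_in_l", "f_in_r"),  # mux(s, x, g_out)
    "f_out":  ("XOR", "f_in", "a"),         # F(v) = v XOR a
    "g_in_l": ("AND", "s", "f_out"),
    "g_in_r": ("AND", "ns", "x"),
    "g_in":   ("XOR", "g_in_l", "g_in_r"),  # mux(s, f_out, x)
    "g_out":  ("AND", "g_in", "b"),         # G(v) = v AND b
    "out_l":  ("AND", "s", "g_out"),
    "out_r":  ("AND", "ns", "f_out"),
    "out":    ("XOR", "out_l", "out_r"),    # mux(s, g_out, f_out)
}

if __name__ == "__main__":
    for s in (0, 1):
        result = evaluate(GATES, {"s": s, "x": 1, "a": 1, "b": 1, "one": 1})
        print(f"s={s} out={result['out']}")
```

With ordinary AND gates that wait for both inputs, the cycle above would never resolve. With eager AND gates, whichever multiplexer branch is deselected outputs zero immediately, which lets the selected block run first and the other block consume its result.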