A parallel program, together with the parallel hardware it runs on, is not only a vehicle for solving numerical problems; it is also a complex system with interesting dynamical behavior. Resynchronization and desynchronization of parallel processes, propagating phases of idleness, and the peculiar effects of noise and system topology are just a few examples. We propose a physical oscillator model (POM) to describe aspects of the dynamics of interacting parallel processes. Motivated by the well-known Kuramoto model, we model a process with its regular compute-communicate cycles as an oscillator that is coupled to other oscillators (processes) via an interaction potential. Instead of simple all-to-all connectivity, we employ a sparse topology matrix that maps the communication structure, and thus the inter-process dependencies, of the program onto the oscillator model. We propose two interaction potentials suited to different scenarios in parallel computing: resource-scalable and resource-bottlenecked applications. The former are not limited by a resource bottleneck such as memory bandwidth or network contention, while the latter are. Unlike the original Kuramoto model, whose periodic sinusoidal potential is attractive for small angles, our characteristic potentials are always attractive for large angles and differ only in their short-distance behavior. We show that the model with appropriate potentials can mimic the propagation of delays and the synchronizing and desynchronizing behavior of scalable and bottlenecked parallel programs, respectively.
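To make the coupling structure concrete, the following is a minimal sketch of a Kuramoto-type system in which each oscillator (process) interacts only with the partners listed in a topology matrix. It is an illustration under stated assumptions, not the model as specified by the authors: the function names, the nearest-neighbor ring topology, the explicit Euler integrator, and all parameter values are hypothetical. `math.tanh` is used only as a stand-in for a non-periodic potential that stays attractive at large phase differences; the actual POM potentials are not given in this abstract.

```python
import math


def ring_topology(n):
    """Hypothetical topology: process i communicates with i-1 and i+1 (periodic)."""
    topo = [[0] * n for _ in range(n)]
    for i in range(n):
        topo[i][(i - 1) % n] = 1
        topo[i][(i + 1) % n] = 1
    return topo


def euler_step(theta, omega, topo, potential, coupling, dt):
    """One explicit Euler step of
    d(theta_i)/dt = omega_i + coupling * sum_j topo[i][j] * potential(theta_j - theta_i).
    """
    n = len(theta)
    new_theta = []
    for i in range(n):
        interaction = sum(
            topo[i][j] * potential(theta[j] - theta[i])
            for j in range(n)
            if topo[i][j]
        )
        new_theta.append(theta[i] + dt * (omega[i] + coupling * interaction))
    return new_theta


def phase_spread(theta):
    """Largest phase difference between any two oscillators."""
    return max(theta) - min(theta)


def simulate(potential, n=8, steps=2000, dt=0.01, coupling=1.0, delay=1.0):
    """Start all processes in phase, delay process 0, and integrate."""
    theta = [0.0] * n
    theta[0] = -delay          # one-off delay injected on a single process
    omega = [1.0] * n          # identical natural frequencies (assumption)
    topo = ring_topology(n)
    for _ in range(steps):
        theta = euler_step(theta, omega, topo, potential, coupling, dt)
    return theta


if __name__ == "__main__":
    # math.sin: classic periodic Kuramoto coupling.
    # math.tanh: assumed stand-in for a potential attractive at large angles.
    for name, potential in (("sin", math.sin), ("tanh", math.tanh)):
        final = simulate(potential)
        print(name, "final phase spread:", phase_spread(final))
```

Passing the potential as a function keeps the topology and integration code unchanged when the interaction is swapped, which mirrors the abstract's point that the two scenarios differ only in the potential's short-distance behavior.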