While RISC-V-based accelerators were initially designed with artificial intelligence applications in mind, they are increasingly recognized as promising platforms for high-performance scientific computing. In this work, we present three strategies for scaling an $N$-body code across multiple Tenstorrent Wormhole accelerators, which are based on the RISC-V architecture. We assess these approaches by measuring both the execution time and the energy consumption required to complete a representative simulation, and we identify the configuration that offers the most favorable balance between energy efficiency and performance.