Processors with extensible instruction sets are now widely used as programmable hardware accelerators across many domains. Extending RISC-V or a similar extensible processor architecture requires designing specialized instructions, a task that instruction synthesis algorithms can automate. In this paper, we consider two algorithms that complement the known approaches and improve the synthesized instruction sets. The common operations clustering algorithm recomputes common operations of a program (operations whose result is consumed by multiple other operations) inside clustered synthesized instructions. The subsuming functions algorithm identifies redundant synthesized instructions, that is, instructions that have equivalents among the other instructions. Experimental evaluations of both algorithms are presented for tests from the domains of cryptography and three-dimensional graphics. For the Magma cipher test, the common operations clustering algorithm reduces the size of the compiled code by 9%, and the subsuming functions algorithm halves the size of the synthesized instruction set extension. For the AES cipher test, the common operations clustering algorithm reduces the size of the compiled code by 10%, and the subsuming functions algorithm reduces the size of the synthesized instruction set extension by a factor of 2.5. Finally, for the instruction set extension from the Volume Ray-Casting test, additionally applying the subsuming functions algorithm reduces the problem-specific instruction set extension from 5 instructions to only 2 without loss of functionality.
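To make the idea of a redundant instruction concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes that an instruction is redundant when another instruction can reproduce it by tying each of its inputs to one of the redundant instruction's operands or to a small constant, and it checks this by brute force over 4-bit values. The names `subsumes`, `prune`, `CONSTANTS`, and the toy instruction set are all hypothetical and chosen only for illustration.

```python
from itertools import product

WIDTH = 4                      # assumed toy bit width, so exhaustive checks stay cheap
MASK = (1 << WIDTH) - 1
VALUES = range(1 << WIDTH)
CONSTANTS = (0, 1, MASK)       # assumed set of constants an input may be tied to


def subsumes(general, g_arity, special, s_arity):
    """Return an input binding under which `general` emulates `special`, or None.

    Each input of `general` is bound either to an operand of `special`
    (("arg", index)) or to a constant (("const", value)).
    """
    choices = [("arg", i) for i in range(s_arity)]
    choices += [("const", c) for c in CONSTANTS]
    for binding in product(choices, repeat=g_arity):
        matches = all(
            general(*[args[v] if kind == "arg" else v for kind, v in binding])
            == special(*args)
            for args in product(VALUES, repeat=s_arity)
        )
        if matches:
            return binding
    return None


def prune(instructions):
    """Greedily keep only instructions that no already-kept instruction subsumes.

    `instructions` maps a name to a (function, arity) pair. Instructions with
    more inputs are visited first, since they are more likely to be general.
    """
    kept = []
    ordered = sorted(instructions.items(), key=lambda item: -item[1][1])
    for name, (func, arity) in ordered:
        redundant = any(
            subsumes(instructions[k][0], instructions[k][1], func, arity) is not None
            for k in kept
        )
        if not redundant:
            kept.append(name)
    return kept


# Hypothetical toy instruction set (not taken from the paper's benchmarks).
INSTRUCTIONS = {
    "xor_add": (lambda a, b, c: ((a ^ b) + c) & MASK, 3),
    "and_or": (lambda a, b, c: (a & b) | c, 3),
    "xor2": (lambda a, b: a ^ b, 2),
    "add2": (lambda a, b: (a + b) & MASK, 2),
    "inc": (lambda a: (a + 1) & MASK, 1),
}

if __name__ == "__main__":
    print(prune(INSTRUCTIONS))
```

In this toy set, `xor2`, `add2`, and `inc` are each reproducible by `xor_add` with one input tied to a constant, so only `xor_add` and `and_or` survive. The exhaustive comparison is exact only for the assumed 4-bit width; a realistic implementation would need a symbolic or formal equivalence check instead.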