The desire for better prediction accuracy and higher execution performance in neural networks never ends. Neural architecture search (NAS) and tensor compilers are two popular techniques for optimizing these two goals, but both are limited to composing or optimizing existing manually designed operators rather than coming up with completely new designs. In this work, we explore the less studied direction of neural operator synthesis, which aims to automatically and efficiently discover novel neural operators with better accuracy and/or speed. We develop an end-to-end framework, Syno, to realize practical neural operator synthesis. Syno makes use of a novel set of fine-grained primitives defined on tensor dimensions, which ensure various desired properties to ease model training, and also enable expression canonicalization techniques to avoid redundant candidates during search. Syno further adopts a novel guided synthesis flow to obtain valid operators that match the specified input/output dimension sizes, and leverages efficient stochastic tree search algorithms to quickly explore the design space. We demonstrate that Syno discovers better operators with an average of $2.06\times$ speedup and less than $1\%$ accuracy loss, even on NAS-optimized models.