The demands for higher performance and accuracy in neural networks (NNs) never end. Existing tensor compilation and Neural Architecture Search (NAS) techniques optimize these two goals orthogonally, yet share many similarities in their concrete strategies. We exploit this opportunity by combining the two and make a case for Kernel Architecture Search (KAS). KAS revisits NAS from a system perspective and zooms into a finer-grained level to generate neural kernels with both high performance and good accuracy. To demonstrate the potential of KAS, we build an end-to-end framework, Canvas, that finds high-quality kernels as convolution replacements. Canvas samples from a rich set of fine-grained primitives to stochastically and iteratively construct new kernels, and evaluates them according to user-specified constraints. Canvas supports freely adjustable tensor dimension sizes inside a kernel and uses two levels of solvers to satisfy structural legality and to fully utilize the model budget. Our evaluation shows that, by replacing standard convolutions in common NNs with the generated kernels, Canvas achieves an average 1.5x speedup over the previous state of the art, with acceptable accuracy loss and search cost. Canvas verifies the practicability of KAS by rediscovering many kernels that were previously designed by hand, and by producing new structures that may inspire future machine learning innovations. Canvas is open-sourced at https://github.com/tsinghua-ideal/Canvas.