Given the recent impressive accomplishments of language models (LMs) in code generation, we explore the use of LMs as adaptive mutation and crossover operators for an evolutionary neural architecture search (NAS) algorithm. While NAS remains too difficult for LMs to succeed at through prompting alone, we find that combining evolutionary prompt engineering with soft prompt-tuning, a method we term EvoPrompting, consistently finds diverse and high-performing models. We first demonstrate that EvoPrompting is effective on the computationally efficient MNIST-1D dataset, where it produces convolutional architecture variants that outperform both those designed by human experts and those produced by naive few-shot prompting in terms of accuracy and model size. We then apply our method to searching for graph neural networks on the CLRS Algorithmic Reasoning Benchmark, where EvoPrompting designs novel architectures that outperform current state-of-the-art models on 21 out of 30 algorithmic reasoning tasks while maintaining similar model size. EvoPrompting succeeds at designing accurate and efficient neural network architectures across a variety of machine learning tasks, and it is general enough to be adapted easily to tasks beyond neural network design.
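To make the idea of an LM acting as a mutation and crossover operator concrete, the following is a minimal sketch of an evolutionary loop of that general shape. It is not the EvoPrompting implementation: everything in it is an assumption made for illustration. Candidates are tuples of layer widths rather than model code, `toy_lm_operator` is a hypothetical random stand-in for an LM that would be few-shot prompted with parent programs, `toy_fitness` is a hypothetical score that replaces training and evaluating a model, and the soft prompt-tuning step is only marked by a comment.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Toy stand-in for "model code": a tuple of at least two layer widths (assumption).
Candidate = Tuple[int, ...]


def toy_lm_operator(parents: Sequence[Candidate], rng: random.Random) -> Candidate:
    """Hypothetical stand-in for the LM operator.

    In an LM-driven search, the parents would be placed in a few-shot prompt and
    the LM would write a new candidate. Here we splice two parents (crossover)
    and rescale one width (mutation) so the sketch runs without any model.
    """
    a, b = rng.sample(list(parents), 2)
    cut = rng.randint(1, min(len(a), len(b)) - 1)  # requires len >= 2
    child = list(a[:cut] + b[cut:])
    i = rng.randrange(len(child))
    child[i] = max(1, int(child[i] * rng.choice((0.5, 2.0))))
    return tuple(child)


def toy_fitness(candidate: Candidate) -> float:
    """Hypothetical score: prefer a total width near 64 and fewer layers.

    A real search would train the candidate and measure accuracy and size.
    """
    return -abs(sum(candidate) - 64) - 0.1 * len(candidate)


def evolve(
    seeds: Sequence[Candidate],
    operator: Callable[[Sequence[Candidate], random.Random], Candidate],
    fitness: Callable[[Candidate], float],
    rounds: int = 10,
    children_per_round: int = 8,
    keep: int = 4,
    seed: int = 0,
) -> List[Candidate]:
    """Run a simple elitist evolutionary loop and return the final population."""
    rng = random.Random(seed)
    population = list(seeds)
    for _ in range(rounds):
        children = [operator(population, rng) for _ in range(children_per_round)]
        # Deduplicate while preserving order, then keep the fittest candidates.
        pool = list(dict.fromkeys(population + children))
        population = sorted(pool, key=fitness, reverse=True)[:keep]
        # A prompt-tuning step on the selected candidates would go here (omitted).
    return population


if __name__ == "__main__":
    seeds = [(8, 8), (16, 4, 4), (32, 32)]
    for candidate in evolve(seeds, toy_lm_operator, toy_fitness):
        print(candidate, round(toy_fitness(candidate), 2))
```

Because parents compete with their children in each round, the best score in the population never decreases. Swapping `toy_lm_operator` for a function that prompts a real LM, and `toy_fitness` for an actual training run, would leave the loop itself unchanged.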