EvoPrompting: Language Models for Code-Level Neural Architecture Search

Given the recent impressive accomplishments of language models (LMs) for code generation, we explore the use of LMs as adaptive mutation and crossover operators for an evolutionary neural architecture search (NAS) algorithm. While NAS still proves too difficult for LMs to succeed at through prompting alone, we find that combining evolutionary prompt engineering with soft prompt-tuning, a method we term EvoPrompting, consistently finds diverse and high-performing models. We first demonstrate that EvoPrompting is effective on the computationally efficient MNIST-1D dataset, where it produces convolutional architecture variants that outperform both those designed by human experts and those found by naive few-shot prompting in terms of accuracy and model size. We then apply our method to searching for graph neural networks on the CLRS Algorithmic Reasoning Benchmark, where EvoPrompting designs novel architectures that outperform current state-of-the-art models on 21 out of 30 algorithmic reasoning tasks while maintaining similar model size. EvoPrompting succeeds at designing accurate and efficient neural network architectures across a variety of machine learning tasks, while also being general enough for easy adaptation to tasks beyond neural network design.
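The search loop described in the abstract can be pictured with the minimal sketch below. It is not the authors' implementation: the names `evolve`, `toy_propose`, and `toy_fitness` are hypothetical, the LM call is replaced by a toy string-editing function, the fitness is an arbitrary stand-in for "accuracy traded off against model size", and the soft prompt-tuning step is omitted entirely. It only illustrates the idea of using a proposal function (the LM, in the paper) as the combined crossover and mutation operator of an evolutionary search.

```python
import random
from typing import Callable, List, Optional, Tuple

# A candidate is model source code treated as opaque text. In this toy,
# it is a comma-separated list of layer widths such as "16,32".
Candidate = str


def evolve(
    seeds: List[Candidate],
    propose: Callable[[List[Candidate]], Candidate],
    fitness: Callable[[Candidate], float],
    rounds: int = 5,
    children_per_round: int = 8,
    parents_per_prompt: int = 2,
    survivors: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Tuple[float, Candidate]]:
    """Elitist evolutionary loop; `propose` plays the role of the LM."""
    rng = rng or random.Random(0)
    population = [(fitness(c), c) for c in seeds]
    for _ in range(rounds):
        pool = [c for _, c in population]
        children = []
        for _ in range(children_per_round):
            # Sample parents that would be placed in the few-shot prompt.
            k = min(parents_per_prompt, len(pool))
            parents = rng.sample(pool, k)
            children.append(propose(parents))
        # Score the children and keep the best of parents plus children.
        scored = population + [(fitness(c), c) for c in children]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        population = scored[:survivors]
    return population


_toy_rng = random.Random(0)


def toy_propose(parents: List[Candidate]) -> Candidate:
    """Stand-in for an LM call: splice two parents, then nudge one width."""
    a = parents[0].split(",")
    b = parents[-1].split(",")
    child = a[: _toy_rng.randint(1, len(a))] + b[_toy_rng.randint(0, len(b)):]
    i = _toy_rng.randrange(len(child))
    child[i] = str(max(1, int(child[i]) + _toy_rng.choice([-8, 8])))
    return ",".join(child)


def toy_fitness(candidate: Candidate) -> float:
    """Arbitrary toy score: total width near 64 is good, extra layers cost."""
    widths = [int(w) for w in candidate.split(",")]
    return -abs(sum(widths) - 64) - len(widths)


if __name__ == "__main__":
    for score, cand in evolve(["8,8", "16", "32,32,32"], toy_propose, toy_fitness):
        print(score, cand)
```

Because each round keeps the top candidates from the union of parents and children, the best score in this sketch never decreases from one round to the next. In a real setting, `propose` would wrap a code-generating LM prompted with the parent programs, and `fitness` would train and evaluate each generated model.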

