Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks, and code generation is one area where they have had a significant impact. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code that defines neural networks. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce \texttt{LLMatic}, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, \texttt{LLMatic} uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and high-performing networks. We test \texttt{LLMatic} on the CIFAR-10 and NAS-Bench-201 benchmarks, demonstrating that it can produce competitive networks while evaluating just $2{,}000$ candidates, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark. The open-sourced code is available at \url{https://github.com/umair-nasir14/LLMatic}.
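To make the QD side of the idea concrete, the following is a minimal MAP-Elites-style sketch, not the \texttt{LLMatic} implementation. Everything in it is an assumption made for illustration: `propose_variation` is a hypothetical random stand-in for an LLM that edits network-defining code, `toy_fitness` stands in for the accuracy of a trained network, and the descriptor (depth and widest layer) is an arbitrary choice. It keeps a single archive of architectures and does not model the QD search over prompts that the abstract mentions. Only the default budget of 2,000 evaluations is taken from the text.

```python
import random


def propose_variation(widths, rng):
    """Hypothetical stand-in for an LLM edit: randomly perturb layer widths."""
    widths = list(widths)
    op = rng.choice(["widen", "narrow", "add", "remove"])
    i = rng.randrange(len(widths))
    if op == "widen":
        widths[i] = min(widths[i] * 2, 512)
    elif op == "narrow":
        widths[i] = max(widths[i] // 2, 8)
    elif op == "add" and len(widths) < 8:
        widths.insert(i, widths[i])
    elif op == "remove" and len(widths) > 1:
        widths.pop(i)
    # If a size limit blocks the chosen edit, the candidate is returned unchanged.
    return tuple(widths)


def toy_fitness(widths):
    """Toy score standing in for validation accuracy (an assumption)."""
    return -abs(sum(widths) - 256) / 256 - 0.05 * len(widths)


def descriptor(widths):
    """Map a candidate to an archive cell: (depth bin, widest-layer bin)."""
    depth_bin = len(widths) - 1                 # depths 1..8 -> bins 0..7
    width_bin = max(widths).bit_length() - 4    # widths 8..512 -> bins 0..6
    return depth_bin, width_bin


def map_elites(budget=2000, seed=0):
    """Evaluate `budget` candidates, keeping the best one per archive cell."""
    rng = random.Random(seed)
    start = (32, 32)
    archive = {descriptor(start): (toy_fitness(start), start)}
    for _ in range(budget - 1):
        # Pick a random elite as parent and ask the "LLM" for a variation.
        _, parent = rng.choice(list(archive.values()))
        child = propose_variation(parent, rng)
        cell, fit = descriptor(child), toy_fitness(child)
        # Keep the child if its cell is empty or it beats the current elite.
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, child)
    return archive


if __name__ == "__main__":
    archive = map_elites()
    best_fit, best = max(archive.values())
    print(f"{len(archive)} cells filled; best candidate {best} scores {best_fit:.3f}")
```

The archive is what separates this from a plain hill climber: a candidate only competes with others in the same cell, so shallow and deep, narrow and wide networks are all retained as potential parents.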