Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, progress in discovering effective prompts has been slow, which motivates the search for general prompt optimization methods. Unfortunately, few existing prompt learning methods satisfy the criteria of being truly "general", i.e., automatic, discrete, black-box, gradient-free, and interpretable all at once. In this paper, we introduce metaheuristics, a branch of discrete non-convex optimization methods with over 100 options, as a promising approach to prompt learning. Within our paradigm, we test six typical methods: hill climbing, simulated annealing, genetic algorithms with/without crossover, tabu search, and harmony search, and demonstrate their effectiveness in both white-box and black-box prompt learning. Furthermore, we show that these methods can be used to discover more human-understandable prompts that were previously unknown in both reasoning and image generation tasks, opening the door to a cornucopia of possibilities in prompt optimization. We release all code at \url{https://github.com/research4pan/Plum}.
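To make the search setting concrete, the following is a minimal sketch of hill climbing, the first of the six methods listed above, applied to a discrete prompt. It is an illustration under stated assumptions, not the implementation from the released code: the word-level edit operations (delete, swap, insert), the vocabulary, and the scoring function `toy_score` are all hypothetical. In practice the score would come from evaluating the prompt with a language model on held-out examples, which is what makes the procedure black-box and gradient-free.

```python
import random
from typing import Callable, List, Tuple


def neighbors(tokens: List[str], vocab: List[str],
              rng: random.Random, k: int = 8) -> List[List[str]]:
    """Sample k candidate prompts, each one discrete edit away from `tokens`."""
    out = []
    for _ in range(k):
        cand = list(tokens)
        op = rng.choice(["delete", "swap", "insert"])
        if op == "delete" and len(cand) > 1:
            del cand[rng.randrange(len(cand))]
        elif op == "swap" and len(cand) > 1:
            i, j = rng.sample(range(len(cand)), 2)
            cand[i], cand[j] = cand[j], cand[i]
        else:
            # Fall back to insertion when the prompt is too short to edit otherwise.
            cand.insert(rng.randrange(len(cand) + 1), rng.choice(vocab))
        out.append(cand)
    return out


def hill_climb(prompt: str, score: Callable[[str], float], vocab: List[str],
               steps: int = 50, seed: int = 0) -> Tuple[str, float]:
    """Steepest-ascent hill climbing: move only when a neighbor scores strictly higher."""
    rng = random.Random(seed)
    best = prompt.split()
    best_score = score(" ".join(best))
    for _ in range(steps):
        cands = neighbors(best, vocab, rng)
        top = max(cands, key=lambda c: score(" ".join(c)))
        top_score = score(" ".join(top))
        if top_score > best_score:
            best, best_score = top, top_score
    return " ".join(best), best_score


def toy_score(prompt: str) -> float:
    """Hypothetical stand-in for a black-box evaluation of the prompt.

    Rewards the presence of a few target words and penalizes length.
    """
    words = prompt.split()
    target = {"think", "step", "by"}
    return len(target & set(words)) - 0.1 * len(words)


if __name__ == "__main__":
    vocab = ["think", "step", "by", "carefully"]
    print(hill_climb("answer the question", toy_score, vocab, steps=30))
```

Because the search only queries `score` and edits the prompt as a sequence of words, every intermediate prompt remains readable text. The other listed methods differ mainly in how they accept or combine candidates, for example by occasionally accepting a worse neighbor (simulated annealing) or by keeping a list of recently visited candidates to avoid (tabu search).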