Learning to Transfer for Evolutionary Multitasking

Evolutionary multitasking (EMT) is an emerging approach for solving multitask optimization problems (MTOPs) and has garnered considerable research interest. Implicit EMT is a significant research branch that uses evolution operators to enable knowledge transfer (KT) between tasks. However, current implicit EMT approaches adapt poorly across problems because they rely on a limited number of evolution operators and make insufficient use of evolutionary states when performing KT. As a result, the potential of implicit KT to tackle a variety of MTOPs is not fully exploited. To overcome these limitations, we propose a novel Learning to Transfer (L2T) framework that automatically discovers efficient KT policies for the MTOPs at hand. Our framework conceptualizes the KT process as a sequence of strategic decisions made by a learning agent within the EMT process. We propose an action formulation for deciding when and how to transfer, a state representation with informative features of evolution states, a reward formulation concerning convergence and transfer efficiency gain, and the environment in which the agent interacts with MTOPs. We employ an actor-critic network structure for the agent and train it via proximal policy optimization. The learned agent can be integrated with various evolutionary algorithms, enhancing their ability to address a range of new MTOPs. Comprehensive empirical studies on both synthetic and real-world MTOPs, encompassing diverse inter-task relationships, function classes, and task distributions, are conducted to validate the proposed L2T framework. The results show a marked improvement in the adaptability and performance of implicit EMT when solving a wide spectrum of unseen MTOPs.
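To make the decision-process framing concrete, the following is a minimal sketch of how KT could be exposed to an agent as state, action, and reward. It is an illustration under stated assumptions, not the L2T implementation: the class ToyEMTEnv, the two shifted sphere tasks, the three state features, the (transfer, rate) action, and the reward based only on convergence gain are all hypothetical, and a hand-written rule stands in for the actor-critic agent trained with proximal policy optimization.

```python
import random

def sphere(x, shift):
    """Toy task (assumption): shifted sphere function, lower is better."""
    return sum((xi - shift) ** 2 for xi in x)

class ToyEMTEnv:
    """Hypothetical two-task environment; not the paper's actual design."""

    def __init__(self, dim=5, pop_size=20, horizon=30, seed=0):
        self.dim, self.pop_size, self.horizon = dim, pop_size, horizon
        self.rng = random.Random(seed)
        self.shifts = (0.0, 0.5)  # assumed: two related tasks with nearby optima

    def reset(self):
        self.t = 0
        self.pops = [
            [[self.rng.uniform(-5, 5) for _ in range(self.dim)]
             for _ in range(self.pop_size)]
            for _ in self.shifts
        ]
        self.best = [self._best(k) for k in range(2)]
        self.init_best = list(self.best)
        return self._state()

    def _best(self, k):
        return min(sphere(x, self.shifts[k]) for x in self.pops[k])

    def _state(self):
        # Assumed features: elapsed budget and each task's best fitness
        # relative to its initial best.
        rel = [b / (b0 + 1e-12) for b, b0 in zip(self.best, self.init_best)]
        return [self.t / self.horizon] + rel

    def step(self, action):
        """action = (transfer, rate): whether to transfer, and the share of
        offspring whose mate is drawn from the other task's population."""
        transfer, rate = action
        for k in range(2):
            donors = self.pops[1 - k]
            offspring = []
            for parent in self.pops[k]:
                if transfer and self.rng.random() < rate:
                    mate = self.rng.choice(donors)        # cross-task mating
                else:
                    mate = self.rng.choice(self.pops[k])  # within-task mating
                child = [(a + b) / 2 + self.rng.gauss(0, 0.1)
                         for a, b in zip(parent, mate)]
                offspring.append(child)
            # Elitist survival: keep the best pop_size of parents + offspring.
            merged = self.pops[k] + offspring
            merged.sort(key=lambda x: sphere(x, self.shifts[k]))
            self.pops[k] = merged[:self.pop_size]
        new_best = [self._best(k) for k in range(2)]
        # Assumed reward: normalized improvement in best fitness, summed over tasks.
        reward = sum((old - new) / (b0 + 1e-12)
                     for old, new, b0 in zip(self.best, new_best, self.init_best))
        self.best = new_best
        self.t += 1
        return self._state(), reward, self.t >= self.horizon

def placeholder_policy(state):
    """Stand-in for a learned actor: transfer in the first half of the run only."""
    return (state[0] < 0.5, 0.3)

env = ToyEMTEnv()
state, done, total_reward = env.reset(), False, 0.0
while not done:
    state, reward, done = env.step(placeholder_policy(state))
    total_reward += reward
```

In this sketch the policy sees only the state vector and returns a transfer decision, so the hand-written rule could in principle be swapped for a trained network without changing the environment. The reward here captures convergence gain only; the transfer efficiency component mentioned in the abstract is not modeled.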
